
Classification


Classification is usually understood to mean the allocation of objects to certain pre-existing classes or categories. This distinguishes it from clustering, in which similar objects are grouped together, thereby creating new classes.[1] Examples include interpreting a pregnancy test and identifying spam emails.
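The contrast between classification and clustering can be illustrated with a minimal Python sketch. The function names, the word list and the gap threshold are hypothetical choices made only for this illustration; they are not taken from any cited source or library.

```python
# Classification: the classes ("spam", "not spam") exist before any
# object is examined; each object is allocated to one of them.
SPAM_WORDS = {"prize", "winner", "free"}  # hypothetical rule, for illustration only


def classify_email(text):
    """Allocate an email to one of two pre-existing classes."""
    words = set(text.lower().split())
    return "spam" if words & SPAM_WORDS else "not spam"


# Clustering: no classes are given in advance; groups of similar
# objects emerge from the data and thereby create new classes.
def cluster_1d(values, max_gap):
    """Group sorted numbers, starting a new group when the gap exceeds max_gap."""
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= max_gap:
            groups[-1].append(v)  # close enough to the previous value
        else:
            groups.append([v])    # too far away: a new group is created
    return groups
```

In this sketch the classifier can only ever return one of the two labels fixed beforehand, whereas the number and membership of the clusters depend entirely on the values supplied.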


Classification is a part of many different kinds of activities and is studied from many different points of view, including philosophy, law, anthropology, biology, taxonomy, cognition, communications, knowledge organization, psychology, statistics, machine learning, librarianship and mathematics.


As well as 'category', synonyms or near-synonyms for 'class' include 'type', 'species', 'order', 'concept', 'taxon', 'group' and 'division'. Equally, the meaning of the word 'classification' (and its synonyms) may in day-to-day usage take on one of several related meanings: it may encompass both classification and the creation of classes, as for example in 'the task of categorizing pages in Wikipedia'; or it may refer to the underlying scheme of classes; or it may refer to the label given to an object by the classifier.

Evaluation of accuracy

Unlike in decision theory, it is assumed that a classifier repeats the classification task over and over. And unlike a lottery, it is assumed that each classification can be either right or wrong; in the theory of measurement, classification is understood as measurement against a nominal scale. Thus it is possible to try to measure the accuracy of a classifier.


Measuring the accuracy of a classifier allows a choice to be made between two alternative classifiers. This is important both when developing a classifier and in choosing which classifier to deploy. There are however many different methods for evaluating the accuracy of a classifier and no general method for determining which method should be used in which circumstances. Different fields have taken different approaches even when considering the simplest form of classification, binary classification, where only two classes are involved. In pattern recognition, error rate is popular. The Gini coefficient and KS statistic are widely used in the credit scoring industry. Sensitivity and specificity are widely used in epidemiology and medicine. Precision and recall are widely used in information retrieval.[2]
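Several of the measures named above can be derived from the same four counts of a binary classifier's outcomes (true positives, true negatives, false positives and false negatives). The following Python sketch shows the standard definitions; the function name `binary_metrics` and the use of booleans for labels (True meaning the positive class) are assumptions made for this illustration.

```python
def binary_metrics(actual, predicted):
    """Compute common binary-classification measures from paired boolean labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)          # true positives
    tn = sum(not a and not p for a, p in pairs)  # true negatives
    fp = sum(not a and p for a, p in pairs)      # false positives
    fn = sum(a and not p for a, p in pairs)      # false negatives

    def ratio(numerator, denominator):
        # A measure is undefined when its denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "error_rate": ratio(fp + fn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # the same quantity as recall
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical data: 3 actual positives and 5 actual negatives.
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
metrics = binary_metrics(actual, predicted)
```

For this made-up data the counts are 2 true positives, 4 true negatives, 1 false positive and 1 false negative, so the error rate is 2/8 while sensitivity and precision are both 2/3 and specificity is 4/5. The measures summarize the same outcomes differently, which is one reason different fields prefer different ones.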


Classifier accuracy depends greatly on the characteristics of the data to be classified. There is no single classifier that works best on all given problems (a phenomenon that may be explained by the no-free-lunch theorem).

Classification in communications theory

Frederick Suppe[3] distinguished two senses of classification: a broad meaning, which he called "conceptual classification" and a narrow meaning, which he called "systematic classification".


About conceptual classification Suppe wrote:[3]: 292 "Classification is intrinsic to the use of language, hence to most if not all communication. Whenever we use nominative phrases we are classifying the designated subject as being importantly similar to other entities bearing the same designation; that is, we classify them together. Similarly the use of predicative phrases classifies actions or properties as being of a particular kind. We call this conceptual classification, since it refers to the classification involved in conceptualizing our experiences and surroundings."


About systematic classification Suppe wrote:[3]: 292 "A second, narrower sense of classification is the systematic classification involved in the design and utilization of taxonomic schemes such as the biological classification of animals and plants by genus and species."

Examples of important classification systems

Periodic table

The periodic table is the classification of the chemical elements associated in particular with Dmitri Mendeleev (cf. History of the periodic table). An authoritative work on this system is Scerri (2020).[21] Hubert Feger (2001; numbered listing added) wrote about it:[22]: 1967–1968 "A well-known, still used, and expanding classification is Mendeleev's Table of Elements. It can be viewed as a prototype of all taxonomies in that it satisfies the following evaluative criteria:

Philosophical issues

Artificial versus natural classification

Natural classification is a concept closely related to the concept of a natural kind. Carl Linnaeus is often recognized as the first scholar to have clearly differentiated "artificial" and "natural" classifications.[33][34] A natural classification is one that, in Plato's metaphor, is “carving nature at its joints”.[35] Although Linnaeus considered natural classification the ideal, he recognized that his own system (at least partly) represented an artificial classification.


John Stuart Mill explained the artificial nature of the Linnaean classification and suggested the following definition of a natural classification:

Classification of customers, for marketing (as in Master data management) or for profitability (e.g. by Activity-based costing)

Classified information, as in legal or government documentation

Job classification, as in job analysis

Standard Industrial Classification, economic activities

International Society for Knowledge Organization

Class (disambiguation)

Classified (disambiguation)

Classifier (disambiguation)

Data classification (disambiguation)

Categorization

Classification theorem

Folk taxonomy

Taxonomy


Parrochia, Daniel 2016. "Classification". In The Internet Encyclopedia of Philosophy, eds. James Fieser and Bradley Dowden.