Results found: 5

Search results

Search:
in the keywords:  categorical data
EN
Visualization is one of the most important parts of statistical analysis. In this paper we present the R package extracat, which provides new graphical tools: the rmb and cpcp plots [Pilhoefer, Unwin 2013]. The first plot uses a crossover of mosaic plots and multiple bar charts to display the frequencies of a contingency table split up into conditional relative frequencies of one target variable and the absolute frequencies of the corresponding combinations of the remaining explanatory variables. It provides a well-structured representation of the data that is easy to interpret. The second plot, cpcp, uses parallel coordinates: sequences of points represent each of the variable categories, while ordering algorithms are applied to reveal a hierarchical structure in the dataset.
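The decomposition described in this abstract (conditional relative frequencies of a target variable alongside the absolute frequencies of the explanatory-variable combinations) can be sketched numerically. This is only an illustration in plain Python, not the extracat implementation; the toy records and the function name `conditional_frequencies` are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (explanatory_1, explanatory_2, target).
records = [
    ("A", "x", "yes"),
    ("A", "x", "no"),
    ("A", "x", "yes"),
    ("A", "y", "no"),
    ("B", "x", "yes"),
    ("B", "y", "no"),
    ("B", "y", "no"),
    ("B", "y", "yes"),
]

def conditional_frequencies(rows):
    """Split a table into absolute frequencies of each explanatory
    combination and conditional relative frequencies of the target."""
    combo_counts = Counter()
    target_counts = defaultdict(Counter)
    for *explanatory, target in rows:
        combo = tuple(explanatory)
        combo_counts[combo] += 1
        target_counts[combo][target] += 1
    # Divide each target count by the size of its explanatory combination.
    conditional = {
        combo: {t: n / combo_counts[combo] for t, n in counts.items()}
        for combo, counts in target_counts.items()
    }
    return dict(combo_counts), conditional

absolute, conditional = conditional_frequencies(records)
for combo, n in absolute.items():
    print(combo, n, conditional[combo])
```

In a plot of this kind, the absolute frequency would typically drive the size of each cell and the conditional frequencies the bars drawn inside it.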
EN
The paper focuses on latent class models and their application for quantitative data. Latent class modeling is a multivariate technique for analyzing contingency tables and can be viewed as a special case of model-based clustering for multivariate discrete data. It is assumed that each observation comes from one of a number of subpopulations, each with its own probability distribution. We used latent class analysis to group Silesian people and to detect homogeneity among them, using the poLCA package for R. We analyzed data collected by the Department of Social Pedagogy, University of Silesia in Katowice.
EN
The article examines the co-occurrence of opinions on important non-financial aspects of work among people from different age groups, reflecting various stages of professional development: career start, phase of attainments, stage of preserving attainments and a rest phase. The study is based on non-metric survey data, which dictates the choice of appropriate research methods. Association analysis is applied in order to identify patterns of responses. The revealed association rules indicate that there are differences in the perception of non-financial work values among respondents from different age intervals.
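The association analysis mentioned in this abstract rests on two standard quantities, support and confidence. The sketch below shows how rules of the form "antecedent implies consequent" could be extracted from categorical survey answers. The respondents, item labels, thresholds and the function name `single_item_rules` are all hypothetical, and the restriction to single-item rules is a simplification, not the procedure used in the article.

```python
from itertools import permutations

# Hypothetical survey answers: each respondent is a set of "variable=category" items.
respondents = [
    {"age=young", "values=development"},
    {"age=young", "values=development"},
    {"age=young", "values=stability"},
    {"age=older", "values=stability"},
    {"age=older", "values=stability"},
]

def support(itemset, transactions):
    """Share of transactions that contain every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def single_item_rules(transactions, min_support, min_confidence):
    """Return rules a -> b as (a, b, support, confidence) tuples."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        s_ab = support({a, b}, transactions)
        s_a = support({a}, transactions)  # never zero: a occurs in the data
        confidence = s_ab / s_a
        if s_ab >= min_support and confidence >= min_confidence:
            rules.append((a, b, s_ab, confidence))
    return rules

rules = single_item_rules(respondents, min_support=0.4, min_confidence=0.6)
for a, b, s, c in rules:
    print(f"{a} -> {b}  support={s:.2f}  confidence={c:.2f}")
```

Rules whose antecedent is an age group are the ones that would speak to differences between age intervals.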
EN
Visualization plays a crucial role in the research process. There are several advanced plots for visualizing categorical data, such as the mosaic, association, double-decker, sieve and fourfold plots, which are based on the graphical presentation of residuals in a contingency table. In this paper we present new methods for visualizing categorical data, such as the rmb, fluctile and scpcp plots available in the extracat package in R. This package provides a well-structured representation of categorical data and allows for a detailed presentation of the relationship between categories in terms of proportions. We describe rmb, fluctile and cpcp, which are based on the concept of multiple bar charts, a fluctuation diagram from a multidimensional table, and parallel coordinates, respectively. Such plots are mostly used to visualize a contingency table or a data frame; they can also be used for exploratory analysis and allow for a graphical presentation even with a high number of variables [Pilhöfer, Unwin 2013]. All the calculations and plots are obtained using R software.
EN
This paper focuses on hierarchical clustering of categorical data and compares two approaches that can be used for this task. The first, an extremely common approach, is to perform a binary transformation of the categorical variables into sets of dummy variables and then use similarity measures suited to binary data. These similarity measures are well examined, and they are available in both commercial and non-commercial software. However, a binary transformation can cause a loss of information in the data or slow down the computations. The second approach uses similarity measures developed for categorical data. However, these measures are not as well examined as the binary ones, and they are not implemented in commercial software. The two approaches are compared on generated data sets with categorical variables, and the evaluation uses both internal and external evaluation criteria. The purpose of this paper is to show that the binary transformation is not necessary when clustering categorical data, since the second approach leads to clustering results that are at least comparable to those of the first approach.
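The two approaches contrasted in this abstract can be illustrated on a pair of observations. The sketch below dummy-codes categorical rows and applies a binary similarity (Jaccard), then computes a similarity directly on the categories (the simple overlap measure). Both measures are chosen here only as familiar examples and are not necessarily the ones compared in the paper; the data and function names are hypothetical.

```python
# Hypothetical categorical observations (colour, size, shape).
rows = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]

def dummy_encode(data):
    """Binary transformation: one 0/1 column per (variable, category) pair."""
    columns = sorted({(j, v) for row in data for j, v in enumerate(row)})
    return [[1 if row[j] == v else 0 for j, v in columns] for row in data]

def jaccard(u, v):
    """Binary similarity: shared 1s divided by positions where either has a 1."""
    both = sum(a and b for a, b in zip(u, v))
    either = sum(a or b for a, b in zip(u, v))
    return both / either if either else 1.0

def overlap(r, s):
    """Categorical similarity: share of variables with matching categories."""
    return sum(a == b for a, b in zip(r, s)) / len(r)

binary = dummy_encode(rows)
print(jaccard(binary[0], binary[1]))  # approach 1: dummy variables + binary measure
print(overlap(rows[0], rows[1]))      # approach 2: measure on the categories directly
```

Either similarity could be converted to a dissimilarity (one minus the similarity) and passed to a hierarchical clustering routine. Note that the dummy coding widens three variables to six columns here, which hints at the computational cost the abstract mentions.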