Search results for the keyword: random forest
EN
The paper presents an application of interpretable machine learning to identify groups of lakes that share not similar features but similar potential factors influencing total phosphorus content (Ptot). The method was developed on a sample of 60 lakes from North-Eastern Poland and used 25 external explanatory variables. The selected variables are stable over long periods: the first group comprises morphometric parameters of the lakes, and the second encompasses watershed geometry, geology and land use. Our method involves building a regression model, creating an explainer, finding a set of mapping functions describing how each variable influences the outcome, and finally clustering objects by "the influence". The influence is a non-linear and non-parametric transformation of an explanatory variable into a form describing that variable's impact on the modeled feature. Such a transformation makes it possible to group data by the functional relations between the explanatory variables and the explained variable. The study reveals five clusters within which the concentration of Ptot is shaped similarly. We compared our method with other numerical analyses and showed that it provides new information on the relationship between the catchment area and lake trophic state.
2018 | vol. 40 | 73-82
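The pipeline described in this abstract (model, explainer, per-variable mapping functions, clustering by influence) can be illustrated with a minimal sketch. The abstract does not name the explainer or the clustering algorithm, so everything here is an assumption: a centred partial-dependence profile stands in for the mapping functions, a small deterministic k-means stands in for the clustering step, and `toy_model` and the four two-variable "lakes" are hypothetical placeholders, not data from the study.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def partial_dependence(model, rows, j, value):
    """Average prediction when variable j is forced to `value` in every row."""
    total = 0.0
    for row in rows:
        modified = list(row)
        modified[j] = value
        total += model(modified)
    return total / len(rows)

def influence_matrix(model, rows):
    """Replace each raw value by its centred partial dependence (the assumed 'influence')."""
    baseline = sum(model(r) for r in rows) / len(rows)
    return [[partial_dependence(model, rows, j, row[j]) - baseline
             for j in range(len(row))] for row in rows]

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def toy_model(x):
    """Hypothetical fitted regressor: saturates in variable 0, ignores variable 1."""
    return min(x[0], 1.0) + 0.0 * x[1]

# Hypothetical objects; rows 2 and 3 differ strongly in raw feature values.
lakes = [[0.0, 5.0], [0.1, 50.0], [2.0, 7.0], [9.0, 60.0]]
influence = influence_matrix(toy_model, lakes)
labels = kmeans(influence, 2)
```

In this toy setup, rows 2 and 3 fall into the same cluster although their raw features differ, because the saturating model gives them the same influence. That is the kind of grouping the abstract describes, reproduced here only under the stated assumptions.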
EN
Predictive business process monitoring is a current research area whose purpose is to predict the outcome of a whole process (or of an element of a process, i.e. a single event or task) based on available data. In this article we explore the use of tree-based machine learning classification algorithms (CART, C5.0, random forest and extreme gradient boosting) to anticipate the result of a process. We test these algorithms on real-world event-log data and compare them with known approaches. Our results show that.
EN
The aim of the paper was to develop the concept of retail display space allocation as a system and to assess the quality of demand forecasting models for very slow-moving products (models that retail chains in Poland do not yet use) as its key subsystem. Forecasts were made using the example of a clothing company. Model quality was assessed using the Weighted Mean Absolute Percentage Error at several levels of detail: for the entire sales network in a given month, and at the intersection of store, product and product size. The authors first built individual models, then separate models for brick-and-mortar and online stores as well as for brands, creating a set of six models. Model fit improved only for online stores. The findings show that the classification approach for very slow movers yields forecasts as precise as those of the regression approach. No single model or set of models (built with a particular machine learning method) could be identified that made the best demand forecasts for brick-and-mortar stores, as statistical tests generally did not confirm the significance of the differences between the median forecasts.
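As a rough illustration of the error measure named above, the sketch below uses one common definition of WMAPE: the sum of absolute errors divided by the sum of absolute actual values. The abstract does not state the exact weighting the authors used, so this definition is an assumption, and the sales figures are made-up toy numbers.

```python
def wmape(actual, forecast):
    """Weighted MAPE: total absolute error divided by total absolute actual demand."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WMAPE is undefined when all actual values are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total_actual

# Toy monthly demand for five hypothetical store/product/size combinations.
actual = [0, 1, 0, 2, 1]
forecast = [0, 0, 1, 2, 1]

# Fine level of detail: one error term per store/product/size combination.
fine_level = wmape(actual, forecast)

# Network level: demand and forecasts are summed before the error is computed.
network_level = wmape([sum(actual)], [sum(forecast)])
```

Here `fine_level` is 0.5 while `network_level` is 0.0, because over- and under-forecasts cancel once demand is aggregated. This is one reason to evaluate slow movers at more than one level of detail.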
EN
Research background: Even though unintentional accounting errors leading to financial restatements appear to be a less serious distortion of publicly available information, it has been shown that the impact of financial restatements on financial markets is similar to that of intentional fraudulent activities. Unintentional accounting errors leading to financial restatements affect the value of company shares in the short run, which negatively impacts all shareholders. Purpose of the article: The aim of this manuscript is to predict unintentional accounting errors leading to financial restatements based on information from companies' financial statements. The manuscript analyses whether financial statements include sufficient information to allow detection of unintentional accounting errors. Methods: Classification and regression trees (decision tree) and random forest were used to fulfil this aim. The data sample consisted of 400 items from the financial statements of 80 selected international companies. The results of the developed prediction models were compared and explained based on their accuracy, sensitivity, specificity, precision and F1 score. Statistical relationships among variables were tested by correlation analysis. Differences between the groups of companies with and without unintentional accounting errors were tested by means of the Kruskal-Wallis test. Differences among the models were tested by Levene's test and t-tests. Findings & value added: The results of the analysis provide evidence that it is possible to detect unintentional accounting errors with high accuracy based on financial ratios (rather than the Beneish variables) and by applying the random forest method (rather than the classification and regression tree method).
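The five evaluation measures listed in the Methods part can all be derived from a binary confusion matrix. The sketch below shows the standard definitions, assuming that label 1 marks a statement with an accounting error and label 0 one without. The function name and the example labels are hypothetical and are not taken from the manuscript.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, precision and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # errors correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean statements correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean statements wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # errors that were missed

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }

# Toy labels: 3 statements with an error, 5 without.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With these toy labels the classifier finds two of the three errors and raises one false alarm, giving an accuracy of 0.75 and a specificity of 0.8. Reporting sensitivity and specificity alongside accuracy matters when the error class is the minority.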
EN
The aim of this article was to identify relationships between the crime rate and selected socio-economic, demographic and environmental characteristics of the poviats of Poland in 2014. Cross-sectional data were analysed using regression trees, and the tree was generated with an unbiased recursive partitioning method. In the successive splits of the variable space, the following factors proved significant in explaining the intensity of total recorded crime: the urbanisation rate, the percentage of single-person households, the intensity of recorded crime in neighbouring poviats, the divorce rate, and overnight stays provided per 1000 population. A random forest built from many regression trees was then used to improve prediction accuracy and to rank variable importance. The predictor importance rankings obtained for the random forests revealed a particularly strong relationship between crime and urbanisation.
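To show what a variable importance ranking measures, here is a minimal permutation-style sketch. The abstract does not say which importance measure was used for the random forests, so this is a stand-in technique rather than the authors' procedure. A fixed cyclic shift replaces random shuffling to keep the example deterministic, and `toy_model`, the variable names and the numbers are all hypothetical.

```python
def mse(model, rows, targets):
    """Mean squared error of `model` on the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def shifted_column(rows, j):
    """Copy of `rows` with column j rotated by one position (a fixed permutation)."""
    column = [r[j] for r in rows]
    rotated = column[1:] + column[:1]
    return [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, rotated)]

def importance_ranking(model, rows, targets, names):
    """Rank variables by how much the error grows when their column is permuted."""
    base = mse(model, rows, targets)
    scores = {name: mse(model, shifted_column(rows, j), targets) - base
              for j, name in enumerate(names)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def toy_model(x):
    """Hypothetical fitted predictor: depends on variable 0 only."""
    return 2.0 * x[0] + 0.0 * x[1]

# Toy data; the targets are generated by the toy model itself.
rows = [[0.2, 1.0], [0.9, 3.0], [0.5, 2.0], [0.7, 5.0]]
targets = [toy_model(r) for r in rows]
ranking = importance_ranking(toy_model, rows, targets, ["var_a", "var_b"])
```

Permuting `var_a` breaks its link with the target and raises the error, while permuting `var_b` changes nothing, so `var_a` ranks first. A dominant predictor, such as urbanisation in the abstract, would stand out in the same way under this kind of measure.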