

Search results for the keyword: data mining
EN
Original methods of statistical information processing have been used in astrometry and space exploration for many years. Testing conducted over 15 years by the department of mathematical modeling of IUEH showed that these methods are universal in character and can be applied successfully in various fields. After testing, the methods were combined into the new “Nonclassical theory of errors measurement” (NTEM), published in 2015. Objective: to acquaint specialists in the mathematical processing and analysis of statistical information with the subject matter and capabilities of NTEM and with its fundamental principles, knowledge and use of which are particularly important today. Result: the significance of the NTEM procedures within the complex of methods that make up data mining. Methods: the “Nonclassical Theory of Errors Measurement” considers statistical methods that demonstrate the adequacy of the methods used in observational practice. Conclusion: NTEM is a new, important and effective tool for mining large amounts of statistical data, particularly in mathematical modeling, its diagnosis and processing of samples, the volume of which
EN
Success in the financial market goes to those companies that have fast access to data and can use it properly. Modern databases and data warehouses collect vast amounts of information that a person cannot analyze quickly unaided. Data mining methods are used for this purpose; they enable the discovery of new knowledge, that is, rules, patterns and relationships in large databases. The aim of this article is to present data mining methods and their applications. The article is divided into two parts. The first part explains the concept of data mining, discusses data mining methods and provides examples of their applications. The second part presents the companies selling commercial data mining software on the Polish market and discusses examples of free open-source data mining software.
Zarządzanie i Finanse | 2013 | vol. 3 | issue 1 | 103-115
EN
Company diagnostics is vital to a company's safe operation on a competitive market. Among existing IT solutions, diagnostics is most often used in Business Intelligence (BI) class systems designed for monitoring company business activities on a current basis (Business Activity Monitoring, BAM). Solutions of this type, however, reflect the current situation of the company without addressing changes in its environment or any feedback loops existing within the company. The approach presented in the paper adopts a diagnostic method based on analysis of the dynamics of changes and the interrelations between the relevant elements.
EN
The expansion of broadband Internet, the increasing orientation of consumers towards buying via web shops and the increased usage of e-banking services have together contributed largely to the growth of online shopping. This paper deals with determining the influence of the chosen input variables (reading online magazines and newspapers, searching for product information online, using web TV, radio and e-banking services) on the observed target variable (online shopping, categorized by the level of its development in terms of individuals in European countries). The database was preloaded with data from EUROSTAT consisting of values for the abovementioned variables for 29 European countries in the period from 2007 to 2009. For the data mining process, the open source application Orange Canvas was used.
EN
To achieve a better market position, companies need to develop a customer-centric strategy and properly manage the customer data at their disposal in order to obtain useful knowledge. However, converting customer data into customer knowledge is very challenging. Data mining methods and techniques search for hidden relationships and patterns in corporate databases, and herein lies their advantage in the process of generating knowledge. The paper illustrates the application of data mining techniques to the improvement of marketing activities.
EN
This paper aims to use an ERP database to identify the variables that have a significant influence on the duration of a project phase. Several methodologies of the knowledge discovery process are compared and a model of knowledge discovery from an ERP database is proposed. The presented approach is intended for industrial enterprises that use an ERP system to plan and control the development of new products. The example covers four stages of the knowledge discovery process: data selection, data transformation, data mining, and the interpretation of patterns. Among data mining techniques, a fuzzy neural system is chosen to seek relationships between data from completed projects and other data stored in an ERP system.
EN
This paper presents a data mining approach to forecasting exchange rates. It is assumed that exchange rates are determined by both fundamental and technical factors. The balance of fundamental and technical factors varies for each exchange rate and frequency. It is difficult for forecasters to establish the relative relevance of different kinds of factors given this mixture; therefore the utilization of data mining algorithms is advantageous. The approach applied uses a genetic algorithm and neural networks. Out-of-sample forecasting results are illustrated for five exchange rates on different frequencies and it is shown that data mining is able to produce forecasts that perform well.
PL (in English translation)
The article presents the process of mining statistical data in exchange rate forecasting. We assume that exchange rates are influenced both by fundamental factors and by non-economic factors. The balance between these factors varies with the type of exchange rate and the frequency at which it is measured. It is difficult for forecasters to establish the relative strength of the influence of different factors, so an analysis based on data mining has certain advantages. The proposed approach uses genetic algorithms and artificial neural networks. We present the results of out-of-sample forecasting experiments for five exchange rates observed at different frequencies. We show that data mining can be an effective forecasting tool.
EN
Nowadays, more and more enterprises are using Enterprise Resource Planning (ERP) systems, which can also be used to plan and control the development of new products. In order to obtain a project schedule, certain parameters (e.g. duration) have to be specified in an ERP system. These parameters can be defined by employees according to their knowledge, or can be estimated on the basis of data from previously completed projects. This paper investigates using an ERP database to identify those variables that have a significant influence on the duration of a project phase. A model of knowledge discovery from an ERP database is proposed. The presented method covers four stages of the knowledge discovery process in the context of new product development: data selection, data transformation, data mining and interpretation of patterns. Among data mining techniques, a fuzzy neural system is chosen to seek relationships on the basis of data from completed projects stored in an ERP system.
EN
Segmentation in banking for the business client market is traditionally based on size measured in terms of income and the number of employees, and on statistical clustering methods (e.g. hierarchical clustering, k-means). The goal of the paper is to demonstrate that self-organizing maps (SOM) effectively extend the pool of possible criteria for segmentation of the business client market with more relevant criteria, including behavioral, demographic, personal, operational, situational, and cross-selling products. In order to attain the goal of the paper, the dataset on business clients of several banks in Croatia, which, besides size, incorporates a number of different criteria, is analyzed using the SOM-Ward clustering algorithm of Viscovery SOMine software. The SOM-Ward algorithm extracted three segments that differ with respect to the attributes of foreign trade operations (import/export), annual income, origin of capital, important bank selection criteria, views on the loan selection and the industry. The analyzed segments can be used by banks for deciding on the direction of further marketing activities.
EN
The paper addresses the identification of success factors in new product development and the selection of a new product portfolio. The critical success factors are identified on the basis of an enterprise system, including the fields of project management, marketing and customers' comments concerning previous products. The model for measuring the success of a product includes indicators such as the duration and cost of product development, and the net profit from a product. The proposed methodology is based on identifying the relationships between product success and project environment parameters with the use of artificial neural networks and a fuzzy neural system, whose results are compared with those of a linear model. The presented method covers the stages of the knowledge discovery process, such as data selection, data preprocessing, and data mining, in the context of an enterprise resource planning system database. The illustrative example includes a performance comparison of intelligent systems in the context of data preprocessing.
EN
In the paper, we propose a method for mining real-estate listings using clustering algorithms intended for numerical data. The presented approach is based on information systems over ontological graphs. Such information systems have been proposed to deal with data in the form of concepts linked by different semantic relations. Special attention is paid to the preprocessing steps that transform advertisements in textual form into information systems defined over ontological graphs, as well as to the encoding of attribute values for clustering algorithms.
PL (in English translation)
The article proposes a method for mining real-estate listing services using clustering algorithms intended for numerical data. The approach is based on information systems over ontological graphs. Information systems of this type have been proposed to handle data in the form of concepts linked to one another by various semantic relations. Special attention is paid to the preprocessing stage, which transforms the data from textual advertisements into information systems defined over ontological graphs, and to the encoding of attribute values for clustering algorithms.
EN
The paper presents an example of the use of data mining techniques in the aviation industry, based on Turkish Airlines. Its purpose is to present the application of data mining to selected operational data on the baggage of international flight passengers in 2015. The differences in passenger and flight profiles have been examined. First, a two-step approach was used to determine the number of clusters. Second, k-means clustering was applied to divide the data into that number of clusters, representing different areas of consumption. The results can contribute to higher efficiency in decision making regarding destination offer and fleet management.
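As an illustration of the k-means step mentioned in the abstract above, the following is a minimal sketch of Lloyd's algorithm in plain Python. It is not the authors' procedure: the function name `kmeans`, the sample points and the fixed starting centroids are all invented for the example, and the number of clusters is simply taken as given rather than derived by a two-step approach.

```python
def kmeans(points, centroids, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update.

    points    -- list of equal-length numeric tuples (hypothetical data)
    centroids -- starting centroids, one per cluster (assumed given)
    Returns (final centroids, cluster label for each point).
    """
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        # Index of the centroid with the smallest squared Euclidean distance.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        return dists.index(min(dists))

    for _ in range(n_iter):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = []
        for c, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(c)
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated

    labels = [nearest(p) for p in points]
    return centroids, labels


# Invented two-feature records, purely for illustration.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids, labels)
```

Fixing the starting centroids keeps the sketch deterministic; practical implementations usually choose them at random or with a seeding heuristic and rerun the algorithm several times.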
EN
The goal of the paper is to present the application of big data solutions in the management of organizations, especially healthcare entities. It raises the issue of big data application in multiple areas, including decision support and the improvement of the efficiency and efficacy of the whole decision-making process. Big data technologies have manifold advantages for the organizations that implement them and may contribute to the achievement of a competitive advantage by such organizations. The review paper presents the notion of big data solutions with a brief presentation of their architecture and also emphasizes the benefits of their application in healthcare entities and the management of organizations. It describes the methods and techniques of data processing used to analyze huge volumes of data. On the basis of the literature review and an analysis of the McKinsey report, the Big Data Executive Survey 2013 report, IBM, Intel research and case studies, it presents selected examples of big data application in healthcare.
EN
The existing approaches to evaluating the quality of on-line commercial services include various quality indicators. Using multiple attributes for quality evaluation requires specialized analysis techniques. This article proposes to use association rules in the quality evaluation of on-line services. The analysis of the discovered association rules has revealed interesting dependencies and relationships among the individual characteristics of the on-line commercial service. On the basis of such an analysis, conclusions can be drawn regarding the general quality of on-line commercial services. The discovered dependencies and connections can be used in shaping the quality of on-line commercial services. Conclusions from the analysis of the association rules can therefore be used to improve on-line commercial service quality comprehensively, which can lead to higher satisfaction of e-customers.
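To make the notion of an association rule in the abstract above concrete, here is a small sketch of the two standard measures used to rank rules, support and confidence. The attribute names and the four sample records are hypothetical and are not taken from the article; the functions `support` and `confidence` are illustrative helpers, not part of any library.

```python
def support(records, itemset):
    """Fraction of records that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(r) for r in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(records, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the records")
    return support(records, set(antecedent) | set(consequent)) / base


# Hypothetical quality attributes observed for four on-line services.
records = [
    {"fast_delivery", "clear_returns", "high_rating"},
    {"fast_delivery", "high_rating"},
    {"clear_returns"},
    {"fast_delivery", "clear_returns", "high_rating"},
]

print(support(records, {"fast_delivery"}))                      # 0.75
print(confidence(records, {"fast_delivery"}, {"high_rating"}))  # 1.0
```

In this invented sample the rule "fast_delivery -> high_rating" holds in every record where the antecedent occurs, which is the kind of dependency such an analysis looks for.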
EN
Data-driven decisions can be suboptimal when the data are distorted by fraudulent behaviour. Fraud is a common occurrence in finance and related industries, where large datasets are handled and the motivation for financial gain may be high. Quantitative methods are used to detect and prevent fraud. Fraud, however, is also committed in other circumstances, e.g. during clinical trials. The article aims to verify which analytical fraud-detection methods used in finance may be adopted in the field of clinical trials. We systematically reviewed papers published over the last five years in two databases (Scopus and the Web of Science) in the field of economics, finance, management and business in general. We considered a broad scope of data mining techniques including artificial intelligence algorithms. As a result, 37 quantitative methods were identified with the potential of being fit for application in clinical trials. The methods were grouped into three categories: pre-processing techniques, supervised learning and unsupervised learning. Our findings may enhance the future use of fraud-detection methods in clinical trials.
PL (in English translation)
The main aim of the article was to present the results of a study assessing the maturity of the data mining process, using Polish organizations as an example. Partial objectives were assigned to the main aim. CT1: To determine the existing state of knowledge about the data-mining process in the discipline of management sciences. The attempt to achieve this objective served to identify the knowledge gap. CT2: To adopt an appropriate theoretical perspective in the form of a theoretical model that makes it possible to meet future research challenges. The first section of the article describes the results of a quantitative and qualitative bibliometric analysis. The second section presents the parameters and the definition of the data mining process. The next section presents the theoretical model used to measure the maturity of the data mining process. The fourth section characterizes the structure of the study and its partial results, obtained through the empirical procedure. The study found that the vast majority of the surveyed organizations were classified at the first level of process maturity, defined as a state in which organizations show no awareness of the need to identify activities aimed at data mining. The research objectives formulated in the article were achieved using research methods such as quantitative and qualitative bibliometric analysis, opinion surveys and statistical methods.
EN
The main goal of the article is to present the results of the study relating to the assessment of data mining process maturity on the example of Polish organizations. Several partial objectives were added to the main goal. CT1: To diagnose the current state of knowledge regarding the data-mining process in the discipline of management sciences. Attempts at attaining this objective served to identify the knowledge gap. CT2: To adopt an appropriate theoretical perspective in the form of a theoretical model, enabling the implementation of future research challenges. The first section of the article describes the results of quantitative and qualitative bibliometric analysis. The second section presents the parameters and the definition of the data mining process. Then, the theoretical model used for measuring the maturity of the data mining process is discussed. In the fourth section, the structure of the empirical research conducted and its partial results are outlined. It transpired that the vast majority of the surveyed organizations qualified at the first level of process maturity, defined as a state in which organizations are not aware of the need to identify activities aimed at data mining. Research objectives formulated in the article have been implemented using such research methods as quantitative and qualitative bibliometric analysis, opinion polls and statistical methods.
EN
The article addresses the use of new technologies in high schools and in the preparation of future teachers. It focuses on the use of thermometric measurements in various fields of technology. The main emphasis is placed on developing the measuring methodology and on the application of data mining. A special set for thermometric measurements has been developed for didactic use. The set follows the newest teaching methodologies.
PL (in English translation)
In recent years, methods for the symbolic representation of time series have appeared. This research is motivated mainly by practical considerations such as saving memory or fast database searching. Some results on the symbolic representation of time series suggest that the abridged notation can even improve clustering results. The article proposes a new algorithm aimed at the abridged symbolic representation of time series, and in particular at efficient clustering of series. The idea is to use the PAA (piecewise aggregate approximation) technique followed by a correlation analysis of the resulting segments of the series. The primary goal of the article is a modification of the PAA technique that allows series to be clustered further in their abridged form. The authors also sought answers to the following questions: “Does clustering time series in their original form make sense?” and “How much memory can be saved by using the new algorithm?” The efficiency of the new algorithm was examined on empirical time series data sets. The results show that the new proposal is quite effective while requiring very little parametrization from the user.
EN
In recent years several methods for the symbolic representation of time series have been introduced or developed. This activity is mainly justified by practical considerations such as memory savings or fast database searching. However, some results suggest that symbolic representation can even improve the results of time series clustering. The article proposes a new algorithm for the abridged symbolic representation of time series, with the emphasis on efficient time series clustering. The proposal is based on the PAA (piecewise aggregate approximation) technique followed by segmentwise correlation analysis. The primary goal of the article is to improve the PAA technique with respect to subsequent time series clustering (its speed and quality). We also tried to answer the following questions. Is the task of clustering time series in their original form reasonable? How much memory can we save using the new algorithm? The efficiency of the new algorithm was investigated on empirical time series data sets. The results show that the new proposal is quite effective while requiring very little parametric input from the user.
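The PAA step named in the abstract above can be sketched as follows. This shows only the basic piecewise aggregate approximation, in which a series is cut into consecutive frames and each frame is replaced by its mean; it does not reproduce the article's modification or its segmentwise correlation analysis. The function name `paa` and the sample series are assumptions made for the example.

```python
def paa(series, n_segments):
    """Basic piecewise aggregate approximation.

    Splits `series` into `n_segments` consecutive frames of (nearly)
    equal length and returns the mean of each frame.
    """
    n = len(series)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    reduced = []
    for i in range(n_segments):
        # Integer frame boundaries; frames differ by at most one element
        # when n is not divisible by n_segments.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        frame = series[start:end]
        reduced.append(sum(frame) / len(frame))
    return reduced


# Invented series of 8 observations reduced to 4 values.
print(paa([1, 2, 3, 4, 5, 6, 7, 8], 4))  # [1.5, 3.5, 5.5, 7.5]
```

Storing 4 means instead of 8 observations halves the memory needed for this series, which is the kind of saving the abstract's second question refers to.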
PL (in English translation)
This publication attempts to characterize the deterministic factors that influence winning at poker. The analysis was carried out using one of the data mining methods, classification trees. This technique was chosen because qualitative data were used as the variables explaining the poker game and because the results can be presented simply, even for very large trees. The study uncovered several factors that significantly affect the course of the game.
EN
The paper aims to characterize the key factors determining the outcome of a poker game. The analysis was based on classification trees because qualitative data were used as the explanatory variables. The method enables clear presentation of the results even for very complex tree structures. The study also describes a few other factors that significantly influence the game outcome.