Browsing by Author "Puspitarani, Yan"
- FILTER-BASED FEATURE SELECTION PADA KATEGORISASI ARTIKEL BERITA BERBAHASA INDONESIA (Seminar Teknik Informatika dan Sistem Informasi 2013, Universitas Kristen Maranatha, 2013-04-06) Puspitarani, Yan. With the development of technology, a large amount of information, such as news articles, is available over the internet. Hence text categorization, which applies classification as a data mining task, is needed. The major issue in text categorization is the high dimensionality of the data, so a set of representative attributes must be selected to improve categorization performance. Feature selection is one technique for this task: it reduces the dimensionality and can thereby improve classifier effectiveness. Among the many methods available is filter-based feature selection. This research examines and compares several feature selection techniques on Indonesian news articles using the filter model. The techniques discussed are Gini Index for text categorization, CHI, Information Gain, Expected Cross Entropy, Weight of Evidence, and Orthogonal Centroid Feature Selection (OCFS).
- JOB SELECTION OF THE INFRASTRUCTURE SECTION IN FOUNDATION X WITH C4.5 ALGORITHM (International Journal of Psychosocial Rehabilitation, Vol. 24, Issue 02, 2020) Sunjana; Puspitarani, Yan. An institution generally needs a draft of job costs that is prepared well in advance. To use time effectively and budget efficiently, a mechanism is needed that makes it easy to determine the criteria for, or feasibility of, a type of work to be included in the Budget Plan (RAB). The plan must be prepared every fiscal year, as is done by the Facilities and Infrastructure Section of Foundation X. To ease decisions about which types of work are feasible to budget within the RAB, the Infrastructure Section needs to predict and classify the various types of work for which data has been recorded. For this purpose, the C4.5 algorithm can be used to classify work based on the available data. With C4.5, the classification process produces a decision tree that indicates which types of work are feasible to include in the RAB for the coming year. Decisions are made by processing existing data, which is important for sustaining the institution's business processes. With the results obtained, the Facilities & Infrastructure Section can more easily prepare the cost draft, which is subsequently submitted for the RAB of Yayasan X University.
- PEMANFAATAN CLUSTERING DALAM PENCARIAN KEMIRIPAN DOKUMEN PAPER CONFERENCE (Konferensi Nasional Informatika, Sekolah Teknik Elektro dan Informatika ITB, 2013-11-28) Puspitarani, Yan. The large amount of information stored on the Internet ... greatly helps authors produce scholarly papers. Such writing is commonly used by academics for conference papers or as coursework for students. This makes it difficult for reviewers to check the originality of the resulting work. Document similarity search is one solution that can be used. Accordingly, clustering in text mining can be used to make document similarity search more effective. This research tests two hypotheses about document similarity search and produces a solution for applying similarity search to Indonesian-language documents. It also sets out to show that K-Means clustering with features selected from a document's title, abstract, introduction, conclusion, and references can outperform ordinary clustering. A prototype application was built to test these hypotheses. The test results show that feature selection for clustering yields the highest accuracy, reaching 0.96. They also show a substantial gap in search time between searching clustered documents and searching unclustered documents.
- PREPARATORY DOCUMENT STRUCTURING TECHNIQUE (International Journal of Psychosocial Rehabilitation, Vol. 24, Issue 02, 2020) Puspitarani, Yan; Zulpratita, Ulil Surtia. The need for mining structured data has increased in the past few years. Structured data is used as input for data mining tasks. Text mining is the part of data mining in which the data is unstructured text. It can handle unstructured or semi-structured data sets such as emails, HTML files, and full-text documents. Unstructured data usually refers to information that does not reside in a traditional row-column database; it is the opposite of structured data. To extract information from text, preprocessing steps are needed. This paper discusses the theoretical basis of document preprocessing for text mining. Brief descriptions of some representative approaches, such as NLP tasks and information extraction, are provided as well.
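
The preprocessing idea described in the last abstract, turning unstructured text into row-column data that a data mining task can consume, can be illustrated with a minimal sketch. It is not taken from any of the listed papers: the function names `preprocess` and `term_frequencies` and the tiny English stop-word list are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for"}


def preprocess(text):
    """Lowercase the text, keep runs of ASCII letters as tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(documents):
    """Return (vocabulary, matrix): one row of term counts per document."""
    counts = [Counter(preprocess(d)) for d in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


if __name__ == "__main__":
    docs = [
        "The need for mining structured data",
        "Text mining is part of data mining",
    ]
    vocabulary, matrix = term_frequencies(docs)
    print(vocabulary)
    for row in matrix:
        print(row)
```

Each row of `matrix` is a structured representation of one document, with columns given by `vocabulary`. A real pipeline would likely add steps such as stemming and a language-appropriate stop-word list, which this sketch omits.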