Asia Pacific University Library catalogue


SEMANTIC CUSTOMER FINDER : NLP AND MACHINE LEARNING APPROACH TO IDENTIFY CUSTOMERS / RAKESH JALLA.

By: RAKESH JALLA (TP045722)
Contributor(s): Prof. Dr. Mandava Rajeswari [Supervisor]
Material type: Text
Publication details: Kuala Lumpur : Asia Pacific University, 2019
Description: ix, 85 pages : illustrations ; 30 cm
Subject(s): Natural language processing (Computer science) | Semantic computing | Artificial intelligence
LOC classification: PM-32-17
Online resources: Available in APres - Requires login to view full text.
Dissertation note: A capstone project submitted in fulfilment of the requirements for the award of the degree of MSc. in Data Science And Business Analytics (UCMP1703DSBA).
Summary: This study explores whether Twitter text can be modelled to identify target markets for digital cameras. It assumes that photographers are a target market for those marketing digital cameras. Twitter data is processed with several Natural Language Processing feature extraction techniques: Bag of Words, Term Frequency-Inverse Document Frequency (TF-IDF), word n-grams, and POS tagging. The study narrows these down to the set of features that models the data most effectively for classifying tweets. Machine learning models commonly used in NLP studies (Naïve Bayes, Support Vector Machines, Random Forest Classification, and Artificial Neural Networks) are then evaluated on that feature set to identify a suitable model for the dataset. Feature engineering and dimensionality reduction by L1-regularization feature selection are applied iteratively with these models. The study concludes that the Random Forest Classification model performs best, with an accuracy of 80%, when built on the selected features: TF-IDF, word bi-grams, and POS tagging.
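
The summary names TF-IDF over word bi-grams among the selected features. As a rough illustration only, the sketch below computes TF-IDF weights for word uni-grams and bi-grams in plain Python. The record does not say which tokenizer or TF-IDF variant the thesis used, so whitespace tokenization and an unsmoothed natural-log IDF are assumptions, the function names `word_ngrams` and `tfidf` are hypothetical, and the example texts are made up. POS tagging, L1-based feature selection, and the Random Forest classifier are not shown.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return lowercase word n-grams for n = 1..n_max (whitespace tokens)."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_max=2):
    """Compute one {n-gram: weight} dict per document.

    tf  = n-gram count / number of n-grams in the document
    idf = ln(N / df), df = number of documents containing the n-gram
    (assumed variant; the thesis may have used a different one)
    """
    grams_per_doc = [word_ngrams(d, n_max) for d in docs]
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))  # count each n-gram once per document
    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        weights.append({
            g: (c / total) * math.log(n_docs / df[g])
            for g, c in counts.items()
        })
    return weights


# Made-up example texts, not data from the thesis.
tweets = [
    "new lens for my camera",
    "shooting portraits with my camera today",
    "traffic is terrible today",
]
vectors = tfidf(tweets)
```

In this toy input the bi-gram "my camera" occurs in two of the three texts, so it receives a lower weight than "new lens", which occurs in only one. A classifier would consume these weights as a feature vector per tweet.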

Item type: Reference
Current library: APU Library, Reference Collection
Collection: Masters Theses
Call number: PM-32-17
Copy number: 1
Status: Not for loan (Restricted access)
Notes: Available in APres
Barcode: 00018476

