NAGARAJAN
Experienced Data Scientist with a demonstrated history of working in the services industry. Skilled in Python, machine learning and Microsoft Excel.
TECHNICAL SKILLS:
Skills: Python, Excel, VBA.
ML Tools: scikit-learn, PySpark, TensorFlow, CUDA GPU.
OS Platforms: Ubuntu (Linux), Windows.
Data Visualization: Matplotlib, seaborn.
IDE: Spyder, Jupyter.
ML & DL: Supervised and Unsupervised models, Classification (Naïve Bayes, kNN, SVM, Logistic Regression, NN, Decision Trees), Regression, CNN
PROFESSIONAL EXPERIENCE:
May 2018 - Present Data Scientist
• Actively involved in every phase of the data science project life cycle, including data extraction, data cleaning, statistical modeling and data visualization with large data sets.
• Developed a machine learning algorithm from scratch for multiclass text classification, which went live and delivered consistent accuracy.
• Worked on ICD classification, one of the hardest NLP problems in the healthcare industry.
• Worked on image processing to extract text from images using Tesseract and OpenCV.
• Used fuzzy logic and other NLP techniques to compare text and separate reports.
• Used AWS cloud services to handle large data sets and build models.
• Coached, mentored and motivated team interns.
Wipro Limited, Chennai, TN, India
September 2013 - May 2017 Analyst / Senior Officer
• Worked as a pricing analyst covering both in-store and e-commerce website pricing.
• Visualized data insights and presented them to stakeholders and retail pricing managers.
• Monitored competitors’ pricing activities to make effective decisions that would improve company revenue.
• Analyzed customer reviews using NLP techniques to help stakeholders decide on better pricing and offers.
• Used VBA macros to automate tasks.
• Automated critical tasks, saving 8.5 hours a day.
• Used market basket analysis to identify store-product relationships.
PROJECTS:
Multilabel text classification
• Designed a classification model that specifies the number of labels present in a given text.
• Using SQL packages in Python, imported data from MySQL to the local environment.
• Cleaned the text using NLP techniques.
• Using the negex package, performed feature-based sentiment analysis and removed negated symptoms from the text, which substantially reduced Type I errors (false positives).
• Built a convolutional neural network and achieved 72% accuracy.
• Using confidence scores, placed low-confidence reports in a separate group and sent them back for audit, which cleaned the data.
• Once the data was cleaned, retraining on the updated data raised accuracy to 84%.
Multiclass classification
• Built a classification model for clinical procedures.
• Using SQL packages in Python, imported data from MySQL to the local environment.
• Segregated all the text under the appropriate headings using regex.
• Removed unwanted headings that had been pulled into the report.
• Built a decision tree classification model and achieved a 98% F1 score.
Table Extraction from Image
• Extracted tabular data from invoices.
• Cleaned the image using the OpenCV package.
• Identified contours and extracted the table using XY coordinates.
• Currently working on extracting data from borderless tables.
Twitter bot detection using Graph Analytics
• Designed a classification model that detects Twitter bots using network analysis.
• Incorporated the NetworkX and Graph2Vec libraries to analyze feature embeddings of the network graph.