• Hello!
    I'm Vedant Dave

    Experienced Data Analyst, aspiring Machine Learning / Deep Learning Engineer.

  • Career Goal

    "My primary goal is to gain new expertise and experience by working with state-of-the-art technology in a progressive organization that suits my entrepreneurial mindset. My ultimate goal is to become an expert in the field by leading projects and research that improve business and human values."

    Currently, I am working as a freelancer on UpWork. Let's make something together!

    View UpWork Profile

About Me

Who Am I?

Hello, nice to see you. I am Vedant Dave, a detail-oriented, visionary data science graduate currently freelancing on UpWork. I love analyzing practical problems and solving them with machine learning. I have a deep interest in AI and want to work on research, development, and operations for machine learning applications. I hold a “Deep Learning Engineer” Nanodegree and a master’s degree in data science. I’d love to combine my previous work experience in data analytics, machine learning, and engineering with state-of-the-art technology to keep solving practical, real-world problems that improve business and human values.

I have working experience in data analytics and reporting, ML/DL model development, and database management. I also worked on a special computer vision project, “tablet crack detection for pharma”, which covered model development, accuracy improvement, and model deployment on cloud technology in a proper production environment with a CI/CD pipeline and microservices.

I have also worked on more than 10 projects involving banking, business intelligence, marketing, and energy datasets. Currently, I am working on the Kaggle competition “HubMAP: Hacking the Kidney”, a research-based project I am pursuing out of personal interest. I also like to read research papers and aim to turn them into code.

My Area of Interest

AI Research and Development

ML Operations (MLOps)

Big Data Architecture

Data Analytics & Reporting

What I Do

Here are my areas of expertise

Deep Learning

Experience implementing research papers with PyTorch, Keras, and TensorFlow (CNN/RNN/GAN).

Machine Learning

Knowledge of modern ML algorithms and their implementation with scikit-learn and NLTK. Familiar with Pandas, Matplotlib, and NumPy.

API dev. & Deployment

Experience with production workflows using CI/CD pipelines, RESTful APIs, and the Flask framework.

Cloud Computing

Worked with AWS (EC2, SageMaker, Lambda, EMR, Elastic Beanstalk, and APIs). Familiar with Azure and GCP.

Data Analytics

Worked on customer, sales, production, and machine data. Extensive knowledge of Python, Tableau, and Qlik.

Microservices

Experience with scaled service architecture, application dockerization, and Kubernetes (EKS, GKE).

Big Data Development

Project experience with Spark (Scala), the Hadoop ecosystem, MapReduce jobs, and Hive analytics.

Git Contributions
Projects
Blogs
Research paper Reading
My Specialty

My State of Skills

  • Python: Data Structures, Algorithms, Data Wrangling, Data Preprocessing, Analytics, Visualization, ML/DL modeling
  • Machine Learning: ML libraries [scikit-learn, NLTK, Pandas, NumPy, SciPy, Jupyter Notebook (Colab)]
  • Deep Learning: ANN, CNN, RNN, GAN; DL libraries [PyTorch, TensorFlow, OpenCV]
  • Database: SQL, PostgreSQL, MongoDB
  • Analytics & Visualization Tools: Tableau 10, Orange, Power BI, and Excel VBA
  • Big Data Technology: Hadoop Ecosystem, Spark, HDFS, HBase, Hive
  • Cloud Computing: AWS (EC2, SageMaker, DeepLens, Lambda, EMR, S3); exposure to GCP and Microsoft Azure
  • Engineering Experience: Engineering automation design, energy and machine data analytics, energy conservation, non-conventional energy
  • Miscellaneous skills: Linux, Shell, Git, Gitflow, JIRA, Confluence, Technical Writing.

  • Python: 80%
  • SQL: 80%
  • Java: 75%
  • MATLAB: 60%
  • scikit-learn: 80%
  • PyTorch: 80%
  • TensorFlow: 80%
  • Spark: 65%
  • Flask: 80%
  • RESTful API: 70%
  • Amazon Web Services: 80%
  • Tableau analytics & dashboard: 90%

    Education


    Won the Bertelsmann Scholarship and successfully completed the Udacity coursework on deep learning model architectures (CNN, RNN, GAN/DCGAN).

    Developed high-accuracy projects such as Dog Breed Classification, Fake TV Script Generation, Celebrity Face Generation, and the AWS deployment of a Sentiment Prediction application.

    Gained and applied new data science skills through projects, paper discussions, and presentations throughout the academic year. Major coursework:

    • Applied Machine Learning
    • Deep Learning
    • Data Mining
    • Computational Intelligence
    • Project Management

    Gained knowledge of electrical and electronic technology, power system management and distribution, energy data analytics, microprocessors, microcontrollers, and networking instrumentation. Worked on the "Solar Power cell efficiency improvement" and "GPS based speed control of Electric Motor using PWM wave" projects as part of an internship.

    Self Learning

    Certifications

    Machine Learning / Cloud / Microservices

    • Deep Learning Specialization (deeplearning.ai)
    • Python for Data Science & Machine Learning Bootcamp
    • IBM Data Science Professional Certification
    • OpenCV for Python Developer
    • Google Cloud Platform: Essential Training
    • NLP with Python for Machine Learning
    • RESTful API with Flask and Python
    • Microservices: Docker and Kubernetes (EKS/GKE)
    • Machine Learning: Stanford (MATLAB)
    • Learning Go

    Self Learning Courses and Certifications

  • Tableau Master Class for Business Intelligence
  • Advanced SQL for Data Scientists
  • Learning Linux Command Line
  • Big Data Modeling and Management Systems
  • Apache Spark with Scala
  • Hive for Big Data Analytics
  • Experience

    Work Experience

    Data Analyst 2017-2018

    Worked with the Data Science team to write Python logic functions and efficient SQL queries for improving business operations. Handled daily raw data gathering, cleaning, analytics and visualization, and Tableau/BI dashboard reporting. Worked on machine learning, computer vision, and big data technology assignments to improve the decision score matrix. Took on ad hoc QA responsibilities for data quality, integrity, and gap analysis.

    Consulting Engineer (IT/ Software - Projects) 2016-2017

    Served as a project management team member responsible for project requirements, business needs, process alteration, and cost estimation. Worked on SCM system database creation, data cleaning, preprocessing, and integration tasks. Handled material tracking system design, project progress analysis reports, and dashboard creation.

    My Project Portfolio

    Recent Work

    Machine Learning Application Deployment using AWS [Sentiment Analysis]

  • Used AWS services such as SageMaker, S3, Lambda, and API Gateway to deploy the ML application.
  • Improved results by applying a random forest algorithm with hyperparameter tuning and model updates, reaching 83% accuracy.
  • Reviewed online critics’ sentiment on movies and classified it with 100% accuracy.

    Dog Breed Image Classification Algorithm for Application

  • Developed an algorithm for the dog breed application with satisfactory accuracy. The algorithm can distinguish between human and dog faces.
  • Tested two approaches to select the final technique: [1] custom CNN, [2] transfer learning.
  • Used pie chart visualization for the final output. Successfully generated the data flow for the “run_app function” with its support functions.

    TV SCRIPT GENERATION (RNN – LSTM)

  • Used LSTM cells in an RNN to generate fake TV scripts from the Seinfeld TV series (9 seasons).
  • Tokenized the data and created lookup tables for RNN input. Created a custom data loader with batch-wise data distribution.
  • Defined the RNN architecture based on a research paper and trained it with hyperparameter tuning. Achieved competitive, acceptable performance on the model-generated (fake) scripts.
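
    The tokenization, lookup-table, and batching steps of the TV script project can be sketched roughly as follows. This is a minimal pure-Python illustration, not the project's actual code: the function names, the toy sentence, and the window size are all assumptions made for the example.

```python
from collections import Counter


def create_lookup_tables(words):
    """Map each distinct word to an integer id and back (hypothetical helper)."""
    counts = Counter(words)
    # Most frequent words get the smallest ids; ties are broken alphabetically
    # so the mapping is deterministic.
    vocab = sorted(counts, key=lambda w: (-counts[w], w))
    vocab_to_int = {word: idx for idx, word in enumerate(vocab)}
    int_to_vocab = {idx: word for word, idx in vocab_to_int.items()}
    return vocab_to_int, int_to_vocab


def make_sequences(ids, seq_length):
    """Slide a window over the ids: seq_length ids in, the next id as target."""
    features, targets = [], []
    for start in range(len(ids) - seq_length):
        features.append(ids[start:start + seq_length])
        targets.append(ids[start + seq_length])
    return features, targets


def batches(features, targets, batch_size):
    """Yield (features, targets) chunks of at most batch_size examples."""
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        yield features[start:end], targets[start:end]


# Toy text made up for this example; it is not taken from the real dataset.
text = "jerry : hello newman . newman : hello jerry ."
words = text.split()  # whitespace tokenization, punctuation already separated
vocab_to_int, int_to_vocab = create_lookup_tables(words)
ids = [vocab_to_int[w] for w in words]
features, targets = make_sequences(ids, seq_length=4)

for batch_x, batch_y in batches(features, targets, batch_size=4):
    print(len(batch_x), "examples; first target word:", int_to_vocab[batch_y[0]])
```

    In a PyTorch setup, the feature and target lists would typically be converted to tensors and wrapped in a data loader so that batching and shuffling are handled by the framework.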

    GAN based Celebrity Face Generation

  • Defined and trained a DCGAN for generating faces from an open-source celebrity dataset.
  • Visualized the trained generator's output to see how it performed; the generated samples looked like fairly realistic faces with small amounts of noise.
  • Used hyperparameter tuning to adjust the generator and classifier losses. Reduced loss by 80 percent over five tuning rounds.

    INSTACART MARKET ANALYSIS

  • Delivered a business solution for Instacart by forecasting purchasing patterns, departmental performance, and future human resource requirements based on a company portfolio of 200,000 active customers and 200,000 orders.
  • Instacart Market and Stock Requirement Analysis

    FINANCIAL FRAUD ANALYTICS & ML-MODELING

  • Analyzed 6,362,620 transaction records from a financial institution using exploratory analytics, feature engineering, and ensemble machine learning experiments to separate fraudulent transactions from genuine ones.
  • Identified the major transaction mode responsible for fraud using the feature importance matrix.
  • Financial Fraud Analytics & ML (LGB) Modeling

    Researching: Effect of feature selection on machine learning performance [accuracy, complexity (time, computation)]

  • FOCUS AREA
  • Data Reduction Techniques: UDFS, LLCFS, CFS
  • Applied ML Models: SVM, KNN, Decision Tree
  • Performance Metrics: 10-fold cross-validation, accuracy, precision, recall, time complexity, computational complexity.

    UNSUPERVISED NEURAL-NET APPROACH TO SURVEY MISSING DATA IMPUTATION - Self Organizing Map

  • Team project based on a research paper on survey missing-data imputation; the method was applied to 16 numerical, categorical, and mixed datasets.
  • Using the NRMS and AE evaluation methods, we filled in an evaluation table for the imputed data, with learning rate and sigma as the tuning parameters.
  • By optimizing both parameters, we obtained results that met the given academic standards (AE > 1 and NRMS < 0.1 for 1% and 5% error rates).

    Banking Loan Pending amount Calculations & IR distribution Prediction (Power BI)

  • Calculated IR distribution, pending and paid loan amounts, remaining durations, and the monthly interest rate for two durations (15 and 30 years).
  • Used DAX (Data Analysis Expressions) for Power BI calculations.
  • Generated bar graphs of customer loan interest rate and total amount distribution over the duration.

    CANADA BANKRUPTCY ANALYSIS (TABLEAU - 10)

  • Created a Tableau dashboard of Canada's bankruptcies for 2014 from Canada's open-source census and bankruptcy datasets.
  • Province-wise analysis across industry sectors, with results partitioned by age group and gender.

    My Knowledge Sharing

    Recent Blog

    "When you’ve written the same code 3 times, write a function,
    When you’ve given the same in-person advice 3 times, write a blog post"

    - David Robinson

    May 7, 2020 | Sentiment Analytic Approaches | 2

    Neural Network for Sentimental Analysis [Part -1: Feature Extraction]

    A blog post about sentiment analysis, different analytic approaches, the neural network approach in depth, and text feature extraction.

    May 7, 2020 | Neural Network Architecture | 2

    Neural Network for Sentimental Analysis [Part -2: Neural_Net Architecture & tuning]

    Explanation of the Google Colab code for data preprocessing, network architecture, training and testing logic, and learning rate adjustment.

    May 8, 2020 | Hypothesis Testing & Performance Improvement | 2

    Neural Network for Sentimental Analysis [Part -3: Noise Reduction Hypothesis Implementations]

    Applied hypothesis testing for noisy data elimination, complexity reduction, polarity cutoff, and ML model improvement.

    Let's Make Something Together!

    Contact Me