Samuel Lemma

samuellemma700@gmail.com

DATA SCIENTIST | DATA ENGINEER

Big data wrangler


• MS in Data Science graduate leveraging academic and hands-on expertise in data mining/modeling/analytics, machine learning algorithms, and big data technologies (including Hadoop and Spark) to design robust analytical and predictive models that generate business-critical insights and facilitate strategic decision-making.

• Extensive experience in writing and running Python/R code and SQL queries, retrieving data from complex data sets, and performing detailed quantitative and qualitative analyses using various statistical techniques to support strategy development and operational optimization in close collaboration with technical and business stakeholders.



DATA SCIENCE PROJECTS

Gene Region Segmentation using U-Net with Attention Mechanism

• Spearheaded the design and implementation of a semantic segmentation model in TensorFlow and Keras that utilized U-Net architecture with an attention mechanism and residual blocks to successfully identify chromosomes in 13.4K+ images.

• Developed the model and leveraged multiple data pre-processing techniques such as one-hot encoding, normalization, and cropping to optimize model performance.

Skills: Deep learning, TensorFlow, Keras, U-Net, attention mechanism, residual blocks, data pre-processing, Python

End-to-End Big Data Solution: Recommendation System

• Built a scalable, fault-tolerant recommendation system that enabled a retail firm to conduct daily analysis of 1 TB of structured and unstructured data from multiple cities while successfully handling increasing data volumes with consistent performance.

Skills: Big data, Hadoop, Spark, data analysis, structured and unstructured data, HDFS

College Enquiry Chat Bot

• Designed an AI-powered chatbot capable of successfully answering simultaneous student queries across multiple colleges with relevant information while integrating various features to enhance user experience.

Skills: Natural language processing, user experience design, multi-user systems

[Image: presentation of the College Enquiry Chat Bot]

Data Warehouse and Data Streaming with Apache Spark and Apache Hadoop

1. Provided data warehousing capability through HDFS and Hive, with functions such as storing, managing, analyzing, indexing, and partitioning data.
2. Provided streaming capability using Apache Storm and other streaming components.

Skills

  • Programming languages

    Python, Java, R, Scala, MATLAB

  • Data Visualization and Reporting

    SSRS, Power BI, Jupyter, Tableau

  • Machine Learning and Big Data 

    Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Natural Language Processing
    Big Data Technologies: Hadoop, Spark, Hive, Flume, Kafka

  • Database Programming

    SQL Databases: MySQL, PostgreSQL, Oracle, MSSQL
    NoSQL Databases: MongoDB, Cassandra, Neo4j

  • Web Development

    Django, Flask, HTML, CSS

LET'S CONNECT

You can simply book an appointment by pressing the button.


Got any questions?

Please feel free to email me at any time. Thank you for your time.