
My skills

Python



I have over eight years of experience with Python, including more than four years in industry. I have worked with the following packages and frameworks:

  • Web scraping: requests, requests-html, selenium
  • Data extraction from PDFs: textract, pdfminer, camelot
  • Data analysis: numpy, pandas
  • Data visualization: matplotlib
  • Connection between Python and SQL: pyodbc, SQLAlchemy
  • Regression analysis: statsmodels, linearmodels
  • Machine learning: tensorflow (CPU, GPU), scikit-learn
  • Gen AI: transformers, llama-cpp-python, whisper, gradio

SQL Server


I have over four years of experience with SQL Server, since my work as a Data Analyst at Rystad Energy primarily involved working with the database. My experience includes the following:

  • Writing queries of any complexity
  • Importing data from .csv files and MS Access databases; some experience with SSIS packages for automating data imports
  • Developing stored procedures for data transformations, cleaning and deduplicating the raw data, and data modeling
  • Optimizing inefficient stored procedures (splitting large update queries into smaller ones, adding or removing table indexes where required)
  • Using SQL Server Agent to schedule SQL stored procedures, Python scripts, and CMD/PowerShell scripts
  • Connecting SQL Server to Power BI dashboards
  • Implementing change management of SQL Server stored procedures using:
    • Azure Data Studio with the SQL Database Projects extension, for rebuilding the database project after making changes
    • Bitbucket git repository, for version control
    • Azure DevOps, for managing the CI/CD pipeline that deploys changes to both the development and production servers
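
As an illustration of the "splitting large update queries into smaller ones" point above, here is a minimal sketch of the batching pattern. It is not taken from any production procedure: the table `raw_data`, the column `cleaned`, and the function `update_in_batches` are hypothetical names, and Python's built-in sqlite3 module stands in for SQL Server so the example runs anywhere. In T-SQL the same idea is commonly written as `UPDATE TOP (n)` inside a `WHILE` loop.

```python
import sqlite3


def update_in_batches(conn, batch_size=1000):
    """Flag unprocessed rows in chunks of at most batch_size; return rows changed.

    Table and column names are hypothetical, chosen only for this sketch.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE raw_data
            SET cleaned = 1
            WHERE id IN (
                SELECT id FROM raw_data WHERE cleaned = 0 LIMIT ?
            )
            """,
            (batch_size,),
        )
        # Committing after each batch keeps every transaction short.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Small in-memory demonstration with 2500 unprocessed rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_data (id INTEGER PRIMARY KEY, cleaned INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO raw_data (id) VALUES (?)", [(i,) for i in range(2500)])
conn.commit()

updated = update_in_batches(conn, batch_size=1000)
remaining = conn.execute(
    "SELECT COUNT(*) FROM raw_data WHERE cleaned = 0"
).fetchone()[0]
print(updated, remaining)
```

Each batch runs in its own short transaction, which on a busy server tends to reduce lock duration and transaction log growth compared with one very large update; the right batch size depends on the workload.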

R


I have over four years of experience with R, all of it academic. I have used R for the following (packages used are given in parentheses):

  • Data description and data manipulation (dplyr)
  • Simple linear and generalized linear models (base R)
  • Time series analysis with ARMA and ARIMA models (forecast)
  • Panel data regression models with panel-corrected standard errors in the presence of heteroskedasticity (panelAR); GitHub
  • Two-stage Heckman model, correcting for self-selection bias (sampleSelection); GitHub
  • Data visualization (ggplot2)
  • Exporting regression results to LaTeX (stargazer)

GenAI


I have gained some experience with machine learning and generative AI in recent years, partly through my industry job and partly self-taught. I have some experience with the following:

  • Classifying low-resolution satellite imagery of oil and gas well pads to identify the time periods corresponding to drilling and fracking (tensorflow)
  • Using whisper for local speech recognition; GitHub
  • Using small, local AI models for translating text; GitHub
  • Using local text-to-speech models; GitHub
  • Running smaller, local LLMs such as the Llama or Gemma families using llama.cpp or koboldcpp