Ilya Prokin, Developer in Bordeaux, France

Ilya Prokin

Verified Expert in Engineering

Data Science Developer

Location
Bordeaux, France
Toptal Member Since
September 7, 2022

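Ilya is a researcher (Ph.D.), data scientist, CTO, and entrepreneur with expertise in applying data science and machine learning in manufacturing, finance, and biotech. He has published five scientific papers, improved stock market volatility prediction, developed MVPs, pitched startups, and built a strong data science community to discuss state-of-the-art DS topics. Ilya enjoys using data to improve businesses, developing innovative approaches in applied data science, and exploring optimization.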

Portfolio

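LoanSnap - AI US Mortgage
Data Science, Python, Amazon Web Services (AWS), Google Cloud AI, Docker...
Data Brunch
Data Science, Community, Communication, Biology, Machine Learning...
FirmPilot AI Inc
Artificial Intelligence (AI), Natural Language Processing (NLP)...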

Experience

Availability

Full-time

Preferred Environment

Linux, Visual Studio Code (VS Code), Python, Slack

The most amazing...

...part of my journey has been building and exiting startups: VC-backed, end-to-end AI products, and building a strong data science community across France.

Work Experience

Lead Data Scientist

2021 - PRESENT
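LoanSnap - AI US Mortgage
  • Coordinated a team of data scientists and engineers. Ran daily standups and handled project management.
  • Delivered weekly reports to senior leadership, specifically the CTO, director of capital markets, and product.
  • Drove the data science portion of company meeting presentations and achieved cross-company collaboration on data science initiatives.
  • Participated in strategic planning and coordinated execution efforts. The data team's marketing recommendations tripled the lead volume.
  • Developed custom models to optimize revenue- and cost-critical decisions across the sales funnel and secondary market hedging activities.
  • Collected data from multiple online sources using custom web scraping solutions and performed competitor intelligence analysis.
  • Developed empathetic LLM agents with unique personalities to provide personalized service to customers. Used OpenAI's GPT-3.5 and GPT-4 APIs, as well as custom LLM models.
Technologies: Data Science, Python, Amazon Web Services (AWS), Google Cloud AI, Docker, Pandas, Scikit-learn, Data Scraping, ETL, Machine Learning, Recommendation Systems, Data Strategy, Data Visualization, Optimization, Linear Optimization, Statistical Analysis, API Integration, OpenAI GPT-4 API, Streamlit, Data Modeling, Forecasting, Amazon SageMaker, Classification, Text Classification, Data Pipelines, GPT, Pricing Models, Data-driven Marketing, Generative Pre-trained Transformers (GPT), OCR, OpenAI GPT-3 API, Tableau, PySpark, Artificial Intelligence (AI), Jupyter, AI Programming, Programming, User Interface (UI), Integration, Language Models, Data Analysis, Machine Learning Operations (MLOps), MySQL, Team Leadership, PostgreSQL, Software Architecture, Sentiment Analysis, Large Language Models (LLMs), Data Engineering, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Analytics, Data Manipulation, Analytics, NumPy, Regression Modeling, Quantitative Analysis, OpenAI, Leadership, APIs, Generative Pre-trained Transformer 3 (GPT-3), Web Scraping, Research, Generative Artificial Intelligence (GenAI), Notion, Data Reporting, Llama 2, Google Cloud Platform (GCP), Vertex, Google Cloud, Amazon Machine Learning, Google Cloud Machine Learning, Prompt Engineering, AI Design, Databricks, Data Mining, Algorithms, Reporting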

Founder | Community Organizer

2019 - PRESENT
Data Brunch
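  • Built a strong data science community that meets weekly to discuss state-of-the-art data science.
  • Grew an excellent data science ecosystem with a wide range of deep expertise, including accomplished researchers, math Olympiad winners, and strong, competitive data scientists.
  • Connected with experts across the country and helped data professionals find jobs.
Technologies: Data Science, Community, Communication, Biology, Machine Learning, Recommendation Systems, Data Visualization, Natural Language Processing (NLP), Forecasting, Classification, Text Classification, Data Pipelines, PySpark, Artificial Intelligence (AI), CTO, Programming, Data Analysis, SpaCy, Team Leadership, BERT, Custom BERT, Deep Reinforcement Learning, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Analytics, Data Manipulation, Analytics, Pandas, Leadership, Content Writing, Research, Generative Artificial Intelligence (GenAI), Notion, Architecture, Technical Writing, Blogging, Data Reporting, AI Design, Algorithms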

NLP Machine Learning Developer

2023 - 2023
FirmPilot AI Inc
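  • Developed a complete technical strategy and detailed specifications for the developers building the product, leveraging OpenAI's ChatGPT, Google's Bard, PaLM2, and Anthropic Claude2, as well as custom open LLMs.
  • Researched state-of-the-art technical solutions and recommended the best options, maximizing business impact.
  • Developed an innovative approach leveraging adversarial LLM training and fine-tuning.
Technologies: Artificial Intelligence (AI), Natural Language Processing (NLP), Machine Learning, Python, Support Vector Machines (SVM), pgvector, ChatGPT, Architecture, Technical Writing, Llama 2, Prompt Engineering, AI Design, Algorithms, Reporting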

Data Science Founder in Residence

2020 - 2021
Entrepreneur First & AptaDeep
  • Selected as one of the top 3% of EF cohort members. EF is a highly competitive program that accepts only potential tech founders with top-tier skills.
  • Delivered weekly reports to entrepreneurs in residence and venture partners, culminating in a pre-seed funding pitch to the investment committee.
  • Developed an MVP using Python, HTML, CSS, and Bootstrap to create a SaaS platform for AI-driven aptamer development.
  • Coordinated with C-level executives at relevant companies to secure POCs/pilots.
  • Oversaw topics such as business models, financial modeling, B2B sales, OKRs, market sizing, competition and defensibility analysis, early-stage growth, fundraising, investor decks, venture economics, communication, and customer development.
  • Used Python for data manipulation, performing online data collection, scraping, data analysis, and modeling for a 360-degree analysis of various startup and news trends.
Technologies: Communication, Business, Financial Modeling, Market Opportunity Analysis, Data Science, Python, Amazon Web Services (AWS), Docker, Pandas, Scikit-learn, Keras, Deep Learning, Websites, Data Scraping, Computational Biology, Biology, Genomics, ETL, Machine Learning, PyTorch, TensorFlow, Data Strategy, Data Visualization, Optimization, Statistical Analysis, API Integration, Data Modeling, Forecasting, Classification, Data Pipelines, Pricing Models, Tableau, Artificial Intelligence (AI), Neural Networks, Web Design, Jupyter, CTO, AI Programming, Programming, User Interface (UI), Integration, Data Analysis, MySQL, Team Leadership, Software Architecture, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Analytics, Data Manipulation, Analytics, NumPy, Regression Modeling, Quantitative Analysis, Leadership, APIs, R&D, Quantum Computing, Content Writing, Research, Generative Artificial Intelligence (GenAI), Notion, Architecture, Data Reporting, AI Design, Healthcare, Data Mining, Algorithms, Reporting

Co-founder | CTO

2019 - 2020
NewsPill (ex-Sysmo)
  • Delivered machine learning-based stock market volatility prediction using anomaly indicators derived from scraped internet chatter, technical data, and contextual data.
  • Redesigned a legacy algorithmic trading system with a reusable, structured code architecture, best practices, and design patterns.
  • Oversaw numerous data science-driven case studies, such as the Trump Mood Predictor (featured on French TV).
  • Built the infrastructure using AWS, Docker, Redis, SQL, Python, Flask, Gunicorn, Nginx, and GitLab.
  • Built a chatbot framework for easily creating rule-based chatbots.
  • Pitched the startup and helped secure funding from BPI & Rockstart AI. Our startup was featured on the BFM Business TV channel (the French Bloomberg).
Technologies: Data Science, Time Series, Options, Scraping, Data Engineering, Amazon Web Services (AWS), Redis, SQL, Flask, Gunicorn, GitLab, Docker, Communication, Fundraising, Chatbots, ETL, Machine Learning, Data Strategy, Data Visualization, Optimization, Statistical Analysis, Real-time Data, Natural Language Processing (NLP), API Integration, Data Modeling, Forecasting, Classification, Text Classification, Data Pipelines, Data-driven Marketing, OCR, Artificial Intelligence (AI), Neural Networks, Web Design, Financial Modeling, Jupyter, CTO, Chatbot Conversation Design, AI Programming, Programming, User Interface (UI), Integration, ChatGPT, Data Analysis, Machine Learning Operations (MLOps), Natural Language Toolkit (NLTK), MySQL, SpaCy, Team Leadership, PostgreSQL, Software Architecture, Sentiment Analysis, TensorFlow, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Analytics, Data Manipulation, Analytics, Data Scraping, Pandas, NumPy, Regression Modeling, Quantitative Analysis, Leadership, APIs, R&D, Web Scraping, Content Writing, Research, Architecture, Technical Writing, Data Reporting, Amazon Machine Learning, AI Design, Data Mining, Algorithms, Reporting, Backtesting Trading Strategies, Trading

Senior Data Scientist

2018 - 2019
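Dataswati - AI for Manufacturing
  • Built predictive models with uncertainty quantification on non-uniformly sampled time series for large French manufacturers.
  • Built various automated data pipelines, from raw data through cross-validation-based automatic feature generation and selection to prediction.
  • Integrated SOTA deep learning: CNN, LSTM, autoencoders, and transfer learning.
  • Served as a tech evangelist through blog posts on Medium.com, talks at meetups, and collaboration with the French Institute for Research in Computer Science and Automation (Inria).
  • Implemented custom algorithms, including optimization via differential evolution, causal models of regime change, Wasserstein distance-based anomaly detection, and a novel multi-domain transfer learning method.
  • Collected and scraped data from various online sources to intelligently augment data and enhance machine learning models with essential external data.
Technologies: Deep Learning, Time Series Analysis, Convolutional Neural Networks (CNN), LSTM, ETL, Machine Learning, PyTorch, TensorFlow, Azure, Time Series, Data Visualization, Optimization, Linear Optimization, Statistical Analysis, API Integration, Data Modeling, Forecasting, Classification, Text Classification, Data Pipelines, OCR, Artificial Intelligence (AI), Neural Networks, Jupyter, AI Programming, Programming, User Interface (UI), Integration, Data Analysis, Natural Language Toolkit (NLTK), MySQL, Software Architecture, Computer Vision, Image Processing, Image Analysis, Deep Reinforcement Learning, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Analytics, Data Manipulation, Analytics, Data Scraping, Pandas, NumPy, Regression Modeling, Quantitative Analysis, APIs, Generative Adversarial Networks (GANs), R&D, Web Scraping, Content Writing, Research, Technical Writing, Blogging, Data Reporting, Google Cloud Platform (GCP), Google Cloud, AI Design, Data Mining, Algorithms, Reporting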

Researcher in Computational Biology and Neuroscience

2013 - 2017
Inria
  • Developed a data-driven model of how biological neurons learn, using various datasets, data cleaning, parsing, transformation, and modeling. Performed numerical simulations, optimization, and sensitivity analysis of differential equations.
  • Published five scientific papers in top journals.
  • Used Python for data analysis (NumPy, SciPy, Pandas, scikit-learn, matplotlib, etc.) and numerical optimization (PyGMO). Redesigned computational modules to use F2PY with Python (100 times faster than Python + SciPy + NumPy).
Technologies: Python, Pandas, Scikit-learn, F2PY, Sensitivity Analysis, Data Cleaning, Numerical Optimization, Writing & Editing, Science, Matplotlib, Machine Learning, Time Series Analysis, Data Visualization, Optimization, Linear Optimization, Statistical Analysis, Data Modeling, Forecasting, Classification, Data Pipelines, Neural Networks, Web Design, Jupyter, Programming, Data Analysis, Natural Language Toolkit (NLTK), MySQL, Image Processing, Image Analysis, Deep Reinforcement Learning, Predictive Modeling, Probability Theory, Predictive Analytics, Data Analytics, Data Manipulation, Analytics, Data Scraping, NumPy, Regression Modeling, Quantitative Analysis, R&D, Content Writing, Research, Technical Writing, Blogging, Data Reporting, Healthcare, Data Mining, Algorithms, Reporting

Trump Mood Predictor

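A fun web app that predicts the mood of Trump's next tweet.

At my first startup, it was used as a marketing tool and as a demonstration of the role of sentiment analysis in the stock market. Markets are known to be driven by the so-called animal spirits of fear and greed. During Trump's presidency, his actions and tweets were affecting the markets and rippling through the entire economy. We built this web app to illustrate some of the unstructured data processing and modeling techniques used to predict stock market volatility.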

AptaDeep

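Developed the POC of a SaaS platform combining molecules and AI to replace expensive antibodies with aptamers, for an AI drug discovery startup. The AI predicts aptamer properties and helps to:
• Develop 10x better aptamers (affinity, specificity, stability, or conformational change)
• Optimize pre-SELEX, SELEX, post-SELEX, and post-production of aptamers, as well as custom non-SELEX processes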

DeepProPhoto

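DeepProPhoto is an AI tool that turns ordinary photos into professional ones within a minute. The app helps users boost their professional visibility and land their dream job while saving money and time.

In this project, I worked on the back end, front end, AI model training, data scraping, and more.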

PsyTrainer

http://t.me/psychotrainerbot
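Unleash your full communication potential with PsyTrainer, your personal AI psychologist. Crafted with OpenAI's technology and the Falcon 7B LLM, fine-tuned on real psychologist-client conversations.

I contributed to the full-stack AI development. The technologies used were Telegram, Python, SQL, Metabase dashboards, Heroku/AWS, Falcon fine-tuned with LoRA, and OpenAI's tech.

PsyTrainer: evolve your conversations, transform your beliefs, unlock your potential, and unleash the power of communication.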

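Personalized Books for Kids

I redefined personalized children's books, taking customization to new heights through AI-driven content creation. Drawing inspiration from platforms like Wonderbly, I leveraged an advanced tech stack to deliver an unparalleled experience.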

CONTRIBUTIONS
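• Full-stack development: I used full-stack development to ensure a seamless user experience.
• Cloud infrastructure: I relied on AWS for scalability and reliability.
• AI content creation: I used Python, PyTorch, TensorFlow, spaCy, and scikit-learn for AI-driven text and illustration generation.
• Data insights: Metabase facilitated data visualization and business intelligence.
• Marketing: Google Ads enhanced the marketing strategy for customer outreach.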

KEY ADVANCEMENTS
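• AI illustrations: AI-generated, personalized, captivating illustrations that place your child in the book.
• AI-generated text: NLP models craft engaging, educational narratives.
• Recommendations: ML algorithms provide tailored book suggestions.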

Languages

Python, SQL, R, C++, Python 3

Libraries/APIs

Pandas, Scikit-learn, Natural Language Toolkit (NLTK), TensorFlow, SpaCy, NumPy, PyTorch, PySpark, LSTM, Keras, Matplotlib

Tools

Jupyter, Notion, Amazon SageMaker, Tableau, Slack, MATLAB, GitLab, Google Cloud AI, AWS CLI

Paradigms

Data Science, ETL

Platforms

Amazon Web Services (AWS), Google Cloud Platform (GCP), Databricks, Azure, Linux, Docker, Visual Studio Code (VS Code), Heroku

Storage

Data Pipelines, MySQL, PostgreSQL, Google Cloud, Redis

Other

Optimization, Data Cleaning, Scientific Computing, Science, Deep Learning, Time Series Analysis, Time Series, Chatbots, Data Scraping, Research, Machine Learning, Data Analysis, Data Visualization, Computational Biology, Data Analytics, Artificial Intelligence (AI), Data Reporting, Linear Optimization, Statistical Analysis, Natural Language Processing (NLP), API Integration, OpenAI GPT-4 API, Data Modeling, Forecasting, Classification, Text Classification, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, Neural Networks, CTO, Chatbot Conversation Design, AI Programming, Programming, User Interface (UI), Integration, Machine Learning Operations (MLOps), Language Models, ChatGPT, Team Leadership, Software Architecture, Computer Vision, Sentiment Analysis, Image Processing, Image Analysis, Deep Reinforcement Learning, Predictive Modeling, Probability Theory, Predictive Analytics, Frameworks, Data Manipulation, Analytics, Regression Modeling, Quantitative Analysis, OpenAI, APIs, Generative Pre-trained Transformer 3 (GPT-3), Generative Adversarial Networks (GANs), R&D, Generative Artificial Intelligence (GenAI), Architecture, Technical Writing, Blogging, Llama 2, Vertex, Amazon Machine Learning, Google Cloud Machine Learning, Prompt Engineering, AI Design, Data Mining, Algorithms, Reporting, Backtesting Trading Strategies, Trading, Convolutional Neural Networks (CNN), Data Engineering, Financial Modeling, Biology, Genomics, Recommendation Systems, Data Strategy, Dashboards, Web Scraping, Real-time Data, PDF Scraping, Streamlit, GPT, Pricing Models, Data-driven Marketing, OCR, Metabase, BERT, Custom BERT, Large Language Models (LLMs), Leadership, Content Writing, Physics, 3D Reconstruction, F2PY, Sensitivity Analysis, Numerical Optimization, Options, Scraping, Gunicorn, Communication, Fundraising, Community, Business, Market Opportunity Analysis, Websites, Writing & Editing, Telegram Bots, Google Ads, Quantum Computing, Support Vector Machines (SVM), pgvector

Industry Expertise

Healthcare, Web Design

Frameworks

Flask

2013 - 2016

Ph.D. in Computer Science

Inria Rhône-Alpes | INSA - Lyon, France

2009 - 2013

Master's Degree in Physics

University of Nizhny Novgorod - Nizhny Novgorod, Russia