Marc Matt, Developer in Hamburg, Germany

Marc Matt

Verified Expert in Engineering

Data Engineer and Developer

Location
Hamburg, Germany
Toptal Member Since
January 5, 2021

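Marc is a data engineer with a passion for data and more than 15 years of experience leading teams and building data platforms in the information technology, real estate, and services industries. He created a Python-based AVRO schema generator that makes parts of a schema reusable. Marc excels with automation, integrations, analysis, the building of models, statistics, big data, CI/CD pipelines, and data modeling.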

Portfolio

RTL Deutschland GmbH
Python 3, Kubernetes, Argo CD, FastAPI, GitLab CI/CD, BigQuery
Bold Metrics Inc.
SQL, Tableau, Python, Data Analysis, Data Build Tool (dbt), Apache Airflow...
MediaMarktSaturn Retail Group
Python 3, Google Cloud, Google Kubernetes Engine (GKE), Apache NiFi...

Experience

Availability

Part-time

Preferred Environment

Apache Airflow, Tableau Server, Tableau, SQL, Pandas, Python, Apache Beam, Git, Linux

The most amazing...

...application I've developed provides pose estimation data in real time to help optimize customers' fitness goals.

Work Experience

Data Engineer

2024 - PRESENT
RTL Deutschland GmbH
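  • Set up AlloyDB to provide real-time recommendations for media consumption.
  • Optimized the recommender ranking for the streaming platform.
  • Built microservices that deliver recommendations to customers in real time.
Technologies: Python 3, Kubernetes, Argo CD, FastAPI, GitLab CI/CD, BigQuery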

Senior Data Analyst

2023 - 2023
Bold Metrics Inc.
  • Created a template for ad hoc reporting for all clients.
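  • Designed and implemented streaming data ingestion into the data warehouse using Amazon Kinesis, Lambda, and Python.
  • Optimized and standardized transformations in the Redshift data warehouse.
Technologies: SQL, Tableau, Python, Data Analysis, Data Build Tool (dbt), Apache Airflow, Amazon Kinesis, AWS Lambda, Serverless Framework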

Data Engineer

2022 - 2022
MediaMarktSaturn Retail Group
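  • Built a supply chain monitoring system for distribution centers nationwide.
  • Implemented APIs for all logistics service providers and transformed their data into company-wide reporting.
  • Built a real-time order tracking system using Apache NiFi on GKE.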
Technologies: Python 3, Google Cloud, Google Kubernetes Engine (GKE), Apache NiFi, Google BigQuery, SQL, Data Build Tool (dbt), Docker, Apache Airflow, Apache Beam, Google Data Studio, Database Schema Design, Data Management, Terraform, Google Cloud Platform (GCP), Google Cloud Functions, Cloud Run, Cloud Tasks, Node.js, APIs, Serverless, Data Lakes, Data Visualization, Kubernetes, Scaling, Dashboards, Data Wrangling, Azure Databricks, Database Architecture, ETL Tools

Cloud Data Engineer and Architect

2021 - 2022
Spin (Tier Mobility) - Main
  • Designed and established an MLOps workflow with Google Vertex AI.
  • Operationalized ML models for real-time use cases.
  • Prepared the migration of DWH from BigQuery to Snowflake.
  • Built an operational support tool for traffic violation incidents.
Technologies: SQL, ETL, Cloud Architecture, Google Cloud Platform (GCP), Big Data, Architecture, Python, Snowflake, Hadoop, REST APIs, Apache Airflow, Git, DevOps, Microservices, Google BigQuery, Big Data Architecture, Machine Learning Operations (MLOps), CI/CD Pipelines, Cloud Security, Data Warehousing, Data Warehouse Design, Apache Avro, Kubeflow, Fivetran, Database Schema Design, Data Management, Terraform, Google Cloud Functions, Cloud Run, APIs, Serverless, Data Lakes, Kubernetes, Scaling, Data Wrangling, Database Architecture, ETL Tools

ETL Engineer

2021 - 2021
Food Marketing Company
  • Parsed JSON data in Talend and loaded it into Redshift.
  • Integrated data from web APIs with Talend into Redshift.
  • Transformed customer data and loaded it into Salesforce using Talend.
Technologies: Talend, JSON, Amazon Redshift Spectrum, Redshift, APIs, Data Wrangling, ETL Tools

Data Engineer

2021 - 2021
Janus
  • Translated legacy ETL pipelines to scalable AWS Glue jobs.
  • Automated resource deployment using AWS CloudFormation.
  • Designed and built a framework in PySpark that makes it easier to add pipelines in the future.
Technologies: AWS Glue, Spark, SQL, Amazon Aurora, Python, Database Schema Design, Data Management, Serverless, Apache Spark, PySpark, Scaling, Data Wrangling, Database Architecture, ETL Tools, AWS IAM

Senior Data Engineer

2021 - 2021
Emma
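  • Designed a new data ingestion API for the data platform that supports streaming analytics.
  • Set up binlog stream processing and real-time event parsing using Kinesis, Lambda, and Kinesis Data Firehose.
  • Optimized data loading in Redshift by analyzing queries and tables to add optimized sort and distribution keys.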
Technologies: Python, Amazon Kinesis, Amazon Web Services (AWS), Redshift, Amazon Redshift Spectrum, Matillion ETL for Redshift, AWS Lambda, Parquet, AWS Fargate, Docker, Databases, Database Schema Design, Data Management, Terraform, APIs, Serverless, Kubernetes, Scaling, Data Wrangling, Database Architecture, ETL Tools, Amazon Elastic MapReduce (EMR), Amazon EKS, AWS IAM

Data Specialist

2020 - 2021
Ear-Reality GmbH
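  • Developed a data lake based on Kinesis and Athena, including reports embedded in Metabase.
  • Moved the production system to a serverless, scalable architecture.
  • Automated load testing of the application using Python and Locust.io.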
Technologies: Amazon Web Services (AWS), SQL, Amazon Kinesis, Amazon Athena, AWS Elastic Beanstalk, Docker, Python, AWS CloudFormation, Databases, Data Reporting, Business Intelligence (BI), Database Schema Design, Data Management, Terraform, APIs, Data Visualization, Dashboards, Data Wrangling

Senior Data Engineer

2018 - 2020
Engel & Völkers
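  • Designed and set up a data platform, including tool selection and data modeling.
  • Built a TensorFlow model to predict property values in a real-time environment.
  • Implemented CI/CD pipelines to automate the deployment of all features of the data platform.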
Technologies: Jenkins, SQL, Tableau, BigQuery, Apache Beam, Apache Airflow, TensorFlow, Google Kubernetes Engine (GKE), Docker, Python, Data Engineering, Data Architecture, Data Analysis, NoSQL, Google BigQuery, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Data Modeling, Google Cloud Platform (GCP), Google Cloud SQL, Data Science, Databases, Data Reporting, Business Intelligence (BI), Database Schema Design, Data Management, Google Cloud Functions, Cloud Run, APIs, Serverless, Data Lakes, Data Visualization, Scaling, Dashboards, Data Wrangling, Database Architecture, ETL Tools

Head of Data Engineering | Machine Learning

2014 - 2018
Surf Media
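  • Led a team of six and was responsible for their personal development.
  • Designed big data systems and a data lake, including tool selection and data modeling.
  • Designed data pipelines and selected models for the development of recommendation engines and fraud recognition systems that work in a real-time environment.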
  • Created the technology roadmap. Oversaw the advancement of all affected data systems.
Technologies: TensorFlow, RabbitMQ, Apache Avro, Tableau, Hortonworks Data Platform (HDP), SQL, Apache NiFi, Apache HAWQ, Talend, Python, Data Engineering, PostgreSQL, Amazon S3 (AWS S3), AWS Lambda, Data Architecture, Amazon Web Services (AWS), NoSQL, Data Pipelines, ETL, Data Warehouse Design, Data Warehousing, Database Modeling, Data Modeling, Talend ETL, Data Science, Databases, Data Reporting, Business Intelligence (BI), Database Schema Design, Data Management, APIs, Spark, Data Visualization, PySpark, Scaling, Dashboards, Data Wrangling, Database Architecture, ETL Tools

Business Intelligence Analyst

2012 - 2014
Surf Media
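  • Designed, developed, and operated the DWH for a group of five companies.
  • Developed a statistical model for predicting orders.
  • Analyzed customers to understand how to optimize revenue in social networks.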
Technologies: Tableau, Perl, Python, MySQL, Data Engineering, PostgreSQL, Data Architecture, Amazon Web Services (AWS), Data Pipelines, ETL, Data Warehouse Design, Data Warehousing, Database Modeling, Data Modeling, Talend ETL, Databases, Data Reporting, Business Intelligence (BI), Database Schema Design, Data Management, APIs, Spark, Data Visualization, Apache Spark, PySpark, Scaling, Dashboards, Data Wrangling, ETL Tools

Database Consultant

2010 - 2012
EOS Information Services, GmbH.
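  • Designed, developed, and operated the DWH for a decision engine in risk management.
  • Designed processes for risk management.
  • Completed the conception and development of an address management process using Perl and Uniserv.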
Technologies: Oracle, Java, Perl, Data Engineering, Data Architecture, Data Analysis, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Databases, Business Intelligence (BI), Database Schema Design, Data Management, ETL Tools

Datawarehousing Consultant

2009 - 2010
Key-Work Consulting, GmbH.
  • Migrated the sales reporting for a mail-order company.
  • Developed a statistical model to optimize sales planning for a mail-order company.
  • Built a statistical model for a dynamic shipping schedule.
Technologies: Python, SQL, SQL Server 2010, Data Engineering, Data Analysis, Data Pipelines, ETL, Data Warehousing, Data Warehouse Design, Database Modeling, Data Modeling, Databases, Data Reporting, Business Intelligence (BI), Data Management, Dashboards, ETL Tools

Database Management

2008 - 2009
Coxulto Marketing Solutions, GmbH.
  • Defined and selected target groups for marketing campaigns.
  • Completed affinity analysis for the complete customer base.
  • Managed and operated the address database, including duplicate elimination.
Technologies: Perl, SQL, Data Engineering, Data Analysis, ETL, Data Warehouse Design, Data Warehousing, Databases, Data Reporting, Business Intelligence (BI), Dashboards, ETL Tools

Lead of Business Intelligence Consumer Products

2007 - 2008
1&1 Internet AG
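  • Coordinated and prioritized all tasks of the business intelligence team.
  • Designed and developed KPI reports for the board of directors.
  • Analyzed the customer structure and built a churn prediction model.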
Technologies: Java, Perl, Data Engineering, Data Analysis, ETL, Data Warehouse Design, Data Warehousing, Database Modeling, Data Modeling, Databases, Data Reporting, Business Intelligence (BI), Data Visualization, Dashboards, ETL Tools

Business Intelligence Analyst

2003 - 2007
1&1 Internet AG
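  • Designed and developed an automated reporting system for customer and contract inventory, as well as internet usage and customer behavior.
  • Integrated customer usage data from the company's websites into the DWH.
  • Coordinated all tasks between the management and development departments.
  • Analyzed the effectiveness of all campaigns for new and existing customers.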
Technologies: Java, MySQL, Perl, Data Engineering, Data Analysis, ETL, Data Warehouse Design, Data Warehousing, Databases, Data Reporting, Business Intelligence (BI), Data Visualization, Dashboards, ETL Tools

AVRO Schema Generator

http://gitlab.com/datascientists.info/avro-generator
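A Python-based AVRO schema generator I developed myself that adds the ability to make parts of a schema reusable. This is useful because AVRO itself does not provide this functionality.

If certain data structures are used in several schemas, the tool makes it possible to define them only once and then reuse them across multiple schemas.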
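
The generator's own code is not reproduced here. The following is only a minimal sketch of the general idea, under the assumption that shared structures are kept in a registry and expanded when a schema is built. The names `FRAGMENTS`, `resolve`, and `build_schema` are hypothetical and are not taken from the actual tool. The sketch inlines a fragment the first time its name is used and keeps later uses as plain name references, since an AVRO schema may define a named type only once.

```python
import copy
import json

# Hypothetical registry of reusable structures, each defined exactly once.
FRAGMENTS = {
    "Address": {
        "type": "record",
        "name": "Address",
        "fields": [
            {"name": "street", "type": "string"},
            {"name": "city", "type": "string"},
        ],
    },
}

# Keys whose values can contain type references in this simplified sketch.
TYPE_KEYS = ("type", "items", "values", "fields")


def resolve(node, fragments, seen):
    """Inline a fragment on first use; leave later uses as name references."""
    if isinstance(node, str):
        if node in fragments and node not in seen:
            seen.add(node)
            # Copy so the shared definition is never mutated.
            return resolve(copy.deepcopy(fragments[node]), fragments, seen)
        return node
    if isinstance(node, list):
        return [resolve(item, fragments, seen) for item in node]
    if isinstance(node, dict):
        return {
            key: resolve(value, fragments, seen) if key in TYPE_KEYS else value
            for key, value in node.items()
        }
    return node


def build_schema(template, fragments=FRAGMENTS):
    """Expand fragment names in a schema template into a full schema dict."""
    return resolve(template, fragments, set())


customer = build_schema({
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "billing", "type": "Address"},
        {"name": "shipping", "type": "Address"},
    ],
})

order = build_schema({
    "type": "record",
    "name": "Order",
    "fields": [{"name": "destination", "type": "Address"}],
})

print(json.dumps(customer, indent=2))
```

Both `customer` and `order` reuse the single `Address` definition. Within `customer`, the second field refers to `Address` by name rather than repeating the record.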

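Evaluation of Property Value

I built a Python/TensorFlow-based deep learning model and API to predict real estate prices based on geolocation and other attributes. The value is predicted in real time using a Flask REST API integrated into the client's website.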

Design and Set-up of Data Platform

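A platform integrating all relevant data of a social media company, where I designed and helped set up various tools. The platform provides real-time access to all data for operational decision support and analytical workloads.
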
AUGUST 2019 - AUGUST 2021

Google Cloud Certified - Professional Data Engineer

Google

Libraries/APIs

Pandas, PySpark, TensorFlow, REST APIs, Node.js

Tools

BigQuery, Apache HAWQ, Apache Avro, Git, Apache Beam, Tableau, Apache Airflow, Jenkins, Apache NiFi, RabbitMQ, Microsoft Excel, Terraform, Amazon Elastic MapReduce (EMR), Amazon EKS, AWS IAM, Google Kubernetes Engine (GKE), Talend ETL, Amazon Athena, AWS CloudFormation, Amazon Redshift Spectrum, Matillion ETL for Redshift, AWS Fargate, AWS Glue, GitLab CI/CD

Languages

Python, SQL, Perl, Java, XML, Snowflake, Python 3, TypeScript

Platforms

Amazon Web Services (AWS), Cloud Run, Linux, Docker, Talend, Hortonworks Data Platform (HDP), Oracle, AWS Lambda, Google Cloud Platform (GCP), Kubernetes, AWS Elastic Beanstalk, Kubeflow

Paradigms

ETL, Business Intelligence (BI), Data Science, DevOps, Microservices

Storage

MySQL, Google Cloud, Database Modeling, Redshift, Databases, Database Architecture, SQL Server 2010, Data Pipelines, Amazon S3 (AWS S3), PostgreSQL, Google Cloud SQL, Data Lakes, Apache Hive, HDFS, NoSQL, Amazon Aurora, JSON

Frameworks

Spark, Apache Spark, Flask, Django, Hadoop, Serverless Framework

Other

Data Visualization, Data Analysis, Data Architecture, Data Engineering, Data Warehousing, Data Modeling, Data Warehouse Design, Data Reporting, Database Schema Design, Data Management, Google Cloud Functions, APIs, Data Wrangling, ETL Tools, Tableau Server, Google BigQuery, Data Profiling, Google Data Studio, Fivetran, Serverless, Scaling, Dashboards, Amazon Kinesis, Parquet, Cloud Architecture, Big Data, Architecture, Big Data Architecture, Machine Learning Operations (MLOps), CI/CD Pipelines, Cloud Security, Data Build Tool (dbt), Cloud Tasks, Azure Databricks, Argo CD, FastAPI
