Python

Python is a high-level, interpreted, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation.
Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming. It is often described as a “batteries included” language due to its comprehensive standard library.

Most popular libraries:

  • pandas is a fast, powerful, flexible and easy-to-use open source library for data analysis and manipulation, built on top of the Python programming language. In particular, it offers data structures and operations for manipulating numerical tables and time series. It is free software released under the three-clause BSD license. More about pandas
  • NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. More about NumPy
  • Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. More about Matplotlib

I refresh my Python skills at Datacamp, which offers Python courses, tutorials at all levels, projects, and a workspace that keep me progressing on my Python journey.

As a Data Analyst, Data Engineer or Data Scientist I can:

  • import, clean, manipulate, and visualize data with some of the most popular Python libraries, including pandas, NumPy, Matplotlib, Seaborn and many more,
  • build an effective data architecture, streamline data processing, and maintain large-scale data systems,
  • create data engineering pipelines, automate common file system tasks, and build a high-performance database.

My Python Learning Path

With Datacamp, Kaggle and others I build my skills and experience and validate my knowledge:

Python Fundamentals (Datacamp) 15 hours (skill track ⇒ certificate)

In this track, I learned the Python basics I need to start my programming journey, including how to clean real-world data for analysis, use data visualization libraries, and write my own Python functions. I also learned how to store, manipulate, and explore data using NumPy, visualize data using Matplotlib, manipulate DataFrames and dictionaries using pandas, and write my own functions and list comprehensions.

Python is a general-purpose programming language that is becoming ever more popular for data science. Companies worldwide are using Python to harvest insights from their data and gain a competitive edge. This course focused on Python specifically for data science. I learned about powerful ways to store and manipulate data, and helpful data science tools to begin conducting my own analyses.

In this course I discovered how dictionaries offer an alternative to Python lists, and why the pandas DataFrame is the most popular way of working with tabular data. In the second chapter of this course, I found out how I can create and manipulate datasets, and how to access them using these structures.
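To illustrate the step from dictionaries to DataFrames, here is a minimal sketch. The fruit data and the variable names are made up for illustration and are not taken from the course.

```python
import pandas as pd

# Made-up sample data: a dictionary of equal-length lists, one list per column.
data = {
    "fruit": ["apple", "banana", "cherry"],
    "color": ["red", "yellow", "red"],
    "price": [0.5, 0.25, 3.0],
}

# A plain dictionary maps keys to values, here fruit names to prices.
price_by_fruit = dict(zip(data["fruit"], data["price"]))

# pandas turns the same dictionary into a labeled table.
df = pd.DataFrame(data).set_index("fruit")

# Label-based access with loc, position-based access with iloc.
banana_color = df.loc["banana", "color"]
first_row = df.iloc[0]

# Boolean filtering returns a smaller DataFrame.
red_fruit = df[df["color"] == "red"]
```

The dictionary answers single lookups, while the DataFrame adds row labels, column selection, and filtering on top of the same data.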

In this first Python Data Science Toolbox course, I learned the art of function writing. I came out of the course able to write my own custom functions, complete with multiple parameters and multiple return values, along with default arguments and variable-length arguments. I gained insight into scoping in Python and learned to write lambda functions and handle errors in my functions.
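A small sketch of these function-writing features, using only the standard library. The function names `summarize` and `safe_summarize` are my own illustrative choices, not course material.

```python
def summarize(*values, precision=2):
    """Return the minimum, maximum, and rounded mean of any number of values."""
    # *values collects a variable number of positional arguments into a tuple;
    # precision is a keyword argument with a default value.
    if not values:
        raise ValueError("summarize() needs at least one value")
    mean = sum(values) / len(values)
    # Returning a tuple gives the caller multiple return values.
    return min(values), max(values), round(mean, precision)


# Tuple unpacking picks up the three return values.
low, high, mean = summarize(3, 5, 10)

# A lambda is a small anonymous function, handy for short one-off operations.
square = lambda x: x ** 2
squares = list(map(square, [1, 2, 3]))


def safe_summarize(*values):
    """Call summarize() and turn its error into a readable message."""
    try:
        return summarize(*values)
    except ValueError as error:
        return f"could not summarize: {error}"
```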

In this second Python Data Science Toolbox course, I continued to build my Python data science skills. I learned about iterators, objects I had already encountered in the context of for loops. Then I learned about list comprehensions, which are extremely handy tools for all data scientists working in Python.
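As a quick illustration of both ideas, here is a sketch with a made-up list of library names:

```python
words = ["pandas", "NumPy", "Matplotlib"]

# iter() turns an iterable into an iterator; next() pulls one item at a time.
# A for loop does the same thing behind the scenes.
it = iter(words)
first = next(it)
second = next(it)
remaining = list(it)  # whatever the iterator has not yet produced

# A list comprehension builds a new list in a single expression.
lengths = [len(word) for word in words]

# An optional condition filters the items that are kept.
long_words = [word.upper() for word in words if len(word) > 5]
```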

Data Manipulation with Python (Datacamp) 16 hours (skill track ⇒ certificate)

Real-world data is messy. That’s why libraries like pandas are so valuable. Using pandas, I can take the pain out of data manipulation by extracting, filtering, and transforming data in DataFrames, clearing a path for quick and reliable data analysis. In this track I learned how to prepare real-world data for analysis and grew my expertise as I worked with multiple DataFrames using pandas. I also gained hands-on experience combining and merging datasets and creating visualizations. Finally, I learned all about NumPy arrays and used New York City’s tree census data to create, sort, filter, and update arrays.

pandas is the world’s most popular Python library, used for everything from data manipulation to data analysis. In this course, I learned how to manipulate DataFrames by extracting, filtering, and transforming real-world datasets for analysis. Using pandas I explored all the core data science concepts. I also learned how to import, clean, calculate statistics, and create visualizations, using pandas to add to the power of Python.
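The extract, filter, and transform steps can be sketched as follows. The `sales` table is invented for illustration and is not one of the course datasets.

```python
import pandas as pd

# Invented sample data standing in for a real-world dataset.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "units": [10, 20, 5, 15],
    "unit_price": [2.0, 2.0, 3.0, 3.0],
})

# Transform: derive a new column from existing ones.
sales["revenue"] = sales["units"] * sales["unit_price"]

# Filter: keep only the rows that satisfy a condition.
big_orders = sales[sales["units"] >= 10]

# Extract: select a subset of columns.
subset = sales[["store", "revenue"]]

# Summary statistics per group.
revenue_by_store = sales.groupby("store")["revenue"].sum()
mean_units = sales["units"].mean()
```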

Often data is in a human-readable format, but it’s not suitable for data analysis. This is where pandas can help—it’s a powerful tool for reshaping DataFrames into different formats. In this course, I grew my data scientist and analyst skills as I learned how to wrangle string columns and nested data contained in a DataFrame. I also learned how to reshape a DataFrame from wide to long format, stack and unstack rows and columns, and get descriptive statistics of a multi-index DataFrame.
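A minimal sketch of these reshaping operations, assuming a tiny invented table with one column per year:

```python
import pandas as pd

# Invented wide-format table: one row per city, one column per year.
wide = pd.DataFrame({
    "city": ["Oslo", "Lima"],
    "2021": [1.0, 2.0],
    "2022": [3.0, 4.0],
})

# Wide to long: melt() turns the year columns into (year, value) rows.
long_format = wide.melt(id_vars="city", var_name="year", value_name="value")

# stack() moves the column labels into an inner index level,
# giving a Series with a (city, year) MultiIndex.
indexed = wide.set_index("city")
stacked = indexed.stack()

# unstack() reverses the operation and returns to the wide layout.
unstacked = stacked.unstack()

# String columns can be wrangled with the .str accessor.
city_codes = wide["city"].str.upper().str[:2]
```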

Being able to combine and work with multiple datasets is an essential skill for any aspiring Data Scientist. pandas is a crucial cornerstone of the Python data science ecosystem. In this course I learned how to handle multiple DataFrames by combining, organizing, joining, and reshaping them using pandas.
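Here is a short sketch of joining and concatenating DataFrames. The `customers` and `orders` tables are hypothetical examples of my own.

```python
import pandas as pd

# Hypothetical tables that share a customer_id key.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cy"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer_id": [1, 1, 3],
    "amount": [25.0, 40.0, 15.0],
})

# Inner join (the default): only customers that have at least one order.
inner = customers.merge(orders, on="customer_id")

# Left join: every customer is kept; missing order columns become NaN.
left = customers.merge(orders, on="customer_id", how="left")

# concat() stacks DataFrames with the same columns on top of each other.
all_orders = pd.concat([orders, orders], ignore_index=True)
```

The `how` argument decides which keys survive the join, so it is usually the first thing I check when a merged table has an unexpected number of rows.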

NumPy is an essential Python library. TensorFlow and scikit-learn use NumPy arrays as inputs, and pandas and Matplotlib are built on top of NumPy. In this Introduction to NumPy course, I became a master wrangler of NumPy’s core object: arrays. I created, sorted, filtered, and updated arrays. I discovered why NumPy is so efficient and used broadcasting and vectorization to make my NumPy code even faster.
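The array operations mentioned here can be sketched like this, using a small invented array rather than the course data:

```python
import numpy as np

# Invented 2x3 array of measurements.
values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# Vectorization: the operation runs over the whole array without a Python loop.
doubled = values * 2

# Broadcasting: the (3,) array of column means is stretched across both rows.
col_means = values.mean(axis=0)
centered = values - col_means

# Filtering with a boolean mask returns a flat array of the matching items.
filtered = values[values > 2.5]

# Sorting returns a new array; reversing it gives descending order.
descending = np.sort(filtered)[::-1]

# Updating works by index or by mask.
updated = values.copy()
updated[updated < 2.0] = 0.0
```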

Data Analyst with Python (Datacamp) 36 hours (skill track ⇒ certificate)

In this track I began my data analyst training with interactive exercises and got hands-on with some of the most popular Python libraries, including pandas, NumPy, Seaborn, and many more. I learned why Python is so popular for data analysis and worked with real-world datasets to grow my data manipulation and exploratory data analysis skills. I also learned key statistics skills, like hypothesis testing.

Python is a general-purpose programming language that is becoming ever more popular for data science. Companies worldwide are using Python to harvest insights from their data and gain a competitive edge. This course focused on Python specifically for data science. I learned about powerful ways to store and manipulate data, and helpful data science tools to begin conducting my own analyses.

In this course I discovered how dictionaries offer an alternative to Python lists, and why the pandas DataFrame is the most popular way of working with tabular data. In the second chapter of this course, I found out how I can create and manipulate datasets, and how to access them using these structures.

pandas is the world’s most popular Python library, used for everything from data manipulation to data analysis. In this course, I learned how to manipulate DataFrames by extracting, filtering, and transforming real-world datasets for analysis. Using pandas I explored all the core data science concepts. Using real-world data, including Walmart sales figures and global temperature time series, I learned how to import, clean, calculate statistics, and create visualizations, using pandas to add to the power of Python.

pandas is a crucial cornerstone of the Python data science ecosystem, with Stack Overflow recording 5 million views for pandas questions. In this course I learned how to handle multiple DataFrames by combining, organizing, joining, and reshaping them using pandas. I worked with datasets from the World Bank and the City Of Chicago. I finished the course with a solid skillset for data-joining in pandas.

Statistics is the study of how to collect, analyze, and draw conclusions from data. It’s a hugely valuable tool that I can use to bring the future into focus and infer the answer to tons of questions. In this course, I grew my statistical skills and learned how to calculate averages, use scatterplots to show the relationship between numeric values, and calculate correlation. I also learned how to tackle probability, the backbone of statistical reasoning, and how to use Python to conduct a well-designed study and draw my own conclusions from data.
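Averages and correlation can be sketched with the standard library alone. The numbers below are invented, and the `pearson` helper is my own illustrative implementation of the Pearson correlation coefficient.

```python
from statistics import mean, median

# Invented data: hours studied and the resulting test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariance / (spread_x * spread_y)


average_score = mean(scores)
middle_score = median(scores)
r = pearson(hours, scores)  # close to 1: a strong positive linear relationship
```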

Seaborn is a powerful Python library that makes it easy to create informative and attractive data visualizations. In this course I learned how to explore this library and create Seaborn plots based on a variety of real-world data sets, including exploring how air pollution in a city changes through the day and looking at what young people like to do in their free time. This data gave me the opportunity to find out about Seaborn’s advantages first hand, including how I can easily create subplots in a single figure and how to automatically calculate confidence intervals.

Exploratory data analysis is a process for exploring datasets, answering questions, and visualizing results. This course presented the tools I need to clean and validate data, to visualize distributions and relationships between variables, and to use regression models to predict and explain. I explored data related to demographics and health, including the National Survey of Family Growth and the General Social Survey. But the methods I learned apply to all areas of science, engineering, and business. I used pandas, a powerful library for working with data, and other core Python libraries including NumPy and SciPy, StatsModels for regression, and Matplotlib for visualization.

Sampling is a cornerstone of inferential statistics and hypothesis testing. It’s a powerful skill used in survey analysis and experimental design to draw conclusions without surveying an entire population. In this Sampling in Python course, I discovered when to use sampling and how to perform common types of sampling, from simple random sampling to more complex methods like stratified and cluster sampling. I also learned how to estimate population statistics and quantify uncertainty in my estimates by generating sampling distributions and bootstrap distributions.
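Below is a rough sketch of simple random sampling followed by a bootstrap distribution of the mean. It uses only the standard library and a synthetic population; the sizes, the seed, and the percentile cut-offs are my own assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic population standing in for real survey data.
population = [rng.gauss(50, 10) for _ in range(1000)]

# Simple random sampling: draw 100 items without replacement.
sample = rng.sample(population, 100)
sample_mean = mean(sample)

# Bootstrap: resample the sample itself WITH replacement many times
# and record the statistic of interest for each resample.
n_resamples = 500
boot_means = sorted(
    mean(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
)

# Approximate 95% interval from the 2.5th and 97.5th percentiles.
ci_low = boot_means[int(0.025 * n_resamples)]
ci_high = boot_means[int(0.975 * n_resamples)]
```

The width of the interval between `ci_low` and `ci_high` is one way to quantify how uncertain the estimate of the population mean is.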

Hypothesis testing lets me answer questions about my datasets in a statistically rigorous way. In this course, I learned how and when to use common tests like t-tests, proportion tests, and chi-square tests. Working with real-world data, including Stack Overflow user feedback and supply-chain data for medical supply shipments, I gained a deep understanding of how these tests work and the key assumptions that underpin them. I also discovered how non-parametric tests can be used to go beyond the limitations of traditional hypothesis tests.
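As a sketch of the idea, the code below computes a Welch two-sample t-statistic by hand and then estimates a p-value with a permutation test, a simple non-parametric alternative. It deliberately avoids any statistics package, and the two groups are invented numbers, so treat it as an illustration rather than the method used in the course.

```python
import random
from statistics import mean, variance

# Invented measurements for two groups.
group_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
group_b = [12.9, 13.1, 12.7, 13.4, 12.8, 13.0]


def welch_t(a, b):
    """Welch's t-statistic: difference in means over its standard error."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided p-value: how often shuffled labels give a gap at least as large."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if gap >= observed:
            hits += 1
    return hits / n_permutations


t_stat = welch_t(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
```

A strongly negative `t_stat` together with a small `p_value` suggests that the difference between the groups is unlikely to be due to chance alone.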

Data Scientist with Python (Datacamp) 90 hours (skill track ⇒ certificate)

In this track, I learned how Python allows me to import, clean, manipulate, and visualize data, all integral skills for any aspiring data professional or researcher. Starting with the Python essentials for data science, I worked through interactive exercises that tested my abilities. I got hands-on with some of the most popular Python libraries for data science, including pandas, Seaborn, Matplotlib, scikit-learn, and many more.

Python is a general-purpose programming language that is becoming ever more popular for data science. Companies worldwide are using Python to harvest insights from their data and gain a competitive edge. This course focused on Python specifically for data science. I learned about powerful ways to store and manipulate data, and helpful data science tools to begin conducting my own analyses.

In this course I discovered how dictionaries offer an alternative to Python lists, and why the pandas DataFrame is the most popular way of working with tabular data. In the second chapter of this course, I found out how I can create and manipulate datasets, and how to access them using these structures.

pandas is the world’s most popular Python library, used for everything from data manipulation to data analysis. In this course, I learned how to manipulate DataFrames by extracting, filtering, and transforming real-world datasets for analysis. Using pandas I explored all the core data science concepts. I also learned how to import, clean, calculate statistics, and create visualizations, using pandas to add to the power of Python.

Being able to combine and work with multiple datasets is an essential skill for any aspiring Data Scientist. pandas is a crucial cornerstone of the Python data science ecosystem. In this course I learned how to handle multiple DataFrames by combining, organizing, joining, and reshaping them using pandas.

Statistics is the study of how to collect, analyze, and draw conclusions from data. It’s a hugely valuable tool that I can use to bring the future into focus and infer the answer to tons of questions. In this course, I grew my statistical skills and learned how to calculate averages, use scatterplots to show the relationship between numeric values, and calculate correlation. I also learned how to tackle probability, the backbone of statistical reasoning, and how to use Python to conduct a well-designed study and draw my own conclusions from data.

Visualizing data in plots and figures exposes the underlying patterns in the data and provides insights. Good visualizations also help me communicate my data to others, and are useful to data analysts and other consumers of the data. In this course, I learned how to use Matplotlib, a powerful Python data visualization library. Matplotlib provides the building blocks to create rich visualizations of many different kinds of datasets. I learned how to create visualizations for different kinds of data and how to customize, automate, and share these visualizations.

Seaborn is a powerful Python library that makes it easy to create informative and attractive data visualizations. In this course I learned how to explore this library and create Seaborn plots based on a variety of real-world data sets, including exploring how air pollution in a city changes through the day and looking at what young people like to do in their free time. This data gave me the opportunity to find out about Seaborn’s advantages first hand, including how I can easily create subplots in a single figure and how to automatically calculate confidence intervals.

In this first Python Data Science Toolbox course, I learned the art of function writing. I came out of the course able to write my own custom functions, complete with multiple parameters and multiple return values, along with default arguments and variable-length arguments. I gained insight into scoping in Python and learned to write lambda functions and handle errors in my functions.

In this second Python Data Science Toolbox course, I continued to build my Python data science skills. I learned about iterators, objects I had already encountered in the context of for loops. Then I learned about list comprehensions, which are extremely handy tools for all data scientists working in Python.

Seaborn is a visualization library that is an essential part of the Python data science toolkit. In this course, I learned how to use Seaborn’s sophisticated visualization tools to analyze multiple real-world datasets including the American Housing Survey, college tuition data, and guests from the popular television series The Daily Show. I was also able to use Seaborn functions to visualize my data in several different formats and customize Seaborn plots for my unique needs.

Exploratory data analysis is a process for exploring datasets, answering questions, and visualizing results. This course presented the tools I need to clean and validate data, to visualize distributions and relationships between variables, and to use regression models to predict and explain. I explored data related to demographics and health, including the National Survey of Family Growth and the General Social Survey. But the methods I learned apply to all areas of science, engineering, and business. I used pandas, a powerful library for working with data, and other core Python libraries including NumPy and SciPy, StatsModels for regression, and Matplotlib for visualization.

  • Working with Categorical Data in Python (course ⇒ certificate)

In this course, I learned how to manipulate and visualize categorical data using pandas and Seaborn. Through hands-on exercises, I got to grips with pandas’ categorical data type, including how to create, delete, and update categorical columns. I also worked with a wide range of datasets including the characteristics of adoptable dogs, Las Vegas trip reviews, and census data to develop my skills at working with categorical data.
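A minimal sketch of working with the pandas categorical type follows. The `dogs` table and its size labels are invented for illustration and are not the course dataset.

```python
import pandas as pd

# Invented data loosely inspired by an adoptable-dogs table.
dogs = pd.DataFrame({"size": ["small", "large", "medium", "small"]})

# Create: convert a string column to the categorical dtype.
dogs["size"] = dogs["size"].astype("category")

# Update: give the categories an explicit, meaningful order.
dogs["size"] = dogs["size"].cat.set_categories(
    ["small", "medium", "large"], ordered=True
)

# Update: rename a category without touching the underlying rows.
dogs["size"] = dogs["size"].cat.rename_categories({"large": "big"})

# Ordered categories support comparisons such as max().
largest = dogs["size"].max()
counts = dogs["size"].value_counts()

# Delete: drop the categorical column again.
without_size = dogs.drop(columns="size")
```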

 

My badges:

Some articles about Programming