Chapter – 1
Introduction To Machine Learning
Contents:
- Machine Learning Definition
- Types of Machine Learning
- Problems Solved with Machine Learning
- Machine Learning Python Tools & Libraries
- Machine Learning Challenges
Machine Learning Definition:
- Machine learning is a subset of Artificial Intelligence (AI), which is a broad branch of computer science.
- AI is about building machines that can perform tasks that a human would typically perform.
- Machine learning focuses on data and algorithms.
- It imitates the way that humans learn.
- Machine learning is the scientific study of algorithms and statistical models that perform a task by using inference instead of explicit instructions.
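The contrast between inference and explicit instructions can be sketched in a few lines of Python. In this minimal, hypothetical example (the temperature data, the labels, and all function names are made up for illustration), one rule is hard-coded by a programmer while the other is estimated from labelled examples.

```python
# Explicit instructions: the programmer picks the threshold by hand.
def is_hot_by_rule(temp_c):
    return temp_c > 30

# Inference: the threshold is estimated from labelled examples.
def learn_threshold(temps, labels):
    """Return the midpoint between the mean of each class (a deliberately simple model)."""
    hot = [t for t, y in zip(temps, labels) if y == 1]
    cold = [t for t, y in zip(temps, labels) if y == 0]
    return (sum(hot) / len(hot) + sum(cold) / len(cold)) / 2

def is_hot_by_model(temp_c, threshold):
    return temp_c > threshold

# Hypothetical training data: temperatures in Celsius, label 1 = "hot", 0 = "not hot".
temps = [10, 15, 18, 28, 33, 35]
labels = [0, 0, 0, 1, 1, 1]

threshold = learn_threshold(temps, labels)
print(f"learned threshold: {threshold:.1f}")
print(is_hot_by_rule(29), is_hot_by_model(29, threshold))
```

With this data the learned threshold falls between the two groups, so the two functions disagree about 29 degrees: the hand-written rule reflects the programmer's guess, while the learned one reflects the examples it was given.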
Machine Learning Flow
Types of Machine Learning
1. Unsupervised Learning:
- In unsupervised learning, labels are not provided, typically because you don’t know all the variables and patterns in advance.
- The model finds patterns in the data and must uncover and create the labels itself.
Unsupervised learning is commonly divided into two subcategories:
- Clustering: Data points are grouped into clusters based on similar features or characteristics. Common uses of this technique include market segmentation, statistical data analysis, social network analysis, image segmentation, and anomaly detection.
- Dimensionality Reduction: This refers to techniques for reducing the number of input variables in the training data. When dealing with high-dimensional data, it is often useful to project the data onto a lower-dimensional subspace that captures the “essence” of the data.
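Both ideas can be sketched in plain Python. The block below is a minimal illustration, not a production implementation: `kmeans` is a bare-bones version of the k-means clustering algorithm, and `project_to_1d` reduces 2D points to one dimension by projecting them onto their direction of greatest variance (the first principal component). The sample points, function names, and the choice of these two particular algorithms are assumptions made for the example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Bare-bones k-means: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = coordinate-wise mean; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    # clusters holds the assignment made in the final iteration.
    return centroids, clusters

def project_to_1d(points):
    """Project 2D points onto their direction of greatest variance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    return [(x - mx) * ux + (y - my) * uy for x, y in points]

# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [points[0], points[3]])
print(centroids)

# Points lying on the line y = x need only one coordinate to describe them.
projected = project_to_1d([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
print(projected)
```

No labels are supplied in either case: the grouping and the projection direction are derived from the data alone.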
2. Supervised Learning:
- It is called supervised learning because it needs a supervisor: a teacher who can show the model the right answer.
- The model is trained on labelled data so that it can accurately identify results.
- It is a popular type of ML because it is widely applicable.
- Like any student, a supervised algorithm learns by example. It needs a teacher who uses training data to help it determine the patterns and relationships between the inputs and outputs.
Supervised learning can be broadly classified into two categories:
- Classification: Classification is the process of categorizing a given set of data into classes. It can be performed on both structured and unstructured data. An example of a classification problem: given a message, decide whether it is spam or not. Classification problems are of two types:
· Binary Classification: The target variable is limited to two options. For example, in fraud detection the target variable takes one of two values: fraudulent or non-fraudulent.
· Multiclass Classification: These ML problems classify an observation into one of three or more categories. For example, classification using features extracted from a set of images of fruit, where each image may be of an orange, an apple, or a pear.
- Regression: In a regression problem, we are no longer mapping an input to a defined number of categories. Instead, we are mapping inputs to a continuous value, such as a real number. One example of an ML regression problem is predicting the price of a company's stock: a regression-based algorithm outputs a numeric price rather than a category.
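The difference between the two categories shows up in what the model returns. The sketch below is a minimal illustration under stated assumptions: the feature values, labels, and prices are made up, and the two techniques chosen (a nearest-neighbour classifier and a least-squares line) are just simple representatives of classification and regression.

```python
import math

def nearest_neighbour_predict(train_x, train_y, x):
    """Classification: return the label of the closest training example."""
    i = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], x))
    return train_y[i]

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical image features for three fruit classes (multiclass classification).
fruit_features = [(1.0, 1.0), (5.0, 1.0), (3.0, 5.0)]
fruit_labels = ["orange", "apple", "pear"]
predicted_fruit = nearest_neighbour_predict(fruit_features, fruit_labels, (4.5, 1.5))
print(predicted_fruit)  # a category

# Hypothetical prices observed on days 1 to 4 (regression).
days = [1, 2, 3, 4]
prices = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(days, prices)
predicted_price = slope * 5 + intercept
print(predicted_price)  # a continuous value
```

In both cases the labelled training data plays the role of the teacher: the classifier is shown which features belong to which fruit, and the regression is shown which price goes with which day.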
3. Reinforcement Learning:
- This type of machine learning continuously improves its model by mining feedback from previous iterations.
- In general, a reinforcement learning agent is able to perceive and interpret its environment, take actions and learn through trial and error.
- Current use cases include, but are not limited to, the following:
· gaming
· resource management
· personalized recommendations
· robotics
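Learning through trial and error can be sketched with a very small agent. The example below is a minimal, hypothetical illustration (an epsilon-greedy agent choosing between two actions); the reward values, function names, and parameters are assumptions made for the example and do not come from any particular library.

```python
import random

def run_agent(reward_fn, n_actions, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly repeat the best-known action, sometimes explore."""
    rng = random.Random(seed)
    values = [0.0] * n_actions  # estimated reward of each action
    counts = [0] * n_actions    # how often each action was tried
    for step in range(steps):
        if step < n_actions:
            action = step                      # try every action once first
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: pick a random action
        else:
            action = max(range(n_actions), key=lambda a: values[a])  # exploit
        reward = reward_fn(action)             # feedback from the environment
        counts[action] += 1
        # Running average of the rewards seen so far for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts

# Hypothetical environment: action 1 pays more than action 0,
# but the agent only finds that out by trying both.
def environment(action):
    return [0.2, 1.0][action]

values, counts = run_agent(environment, n_actions=2)
best_action = max(range(2), key=lambda a: values[a])
print(values, counts, best_action)
```

The agent is never told which action is better. Its estimates come only from the rewards it receives, which is the feedback loop described above.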
Machine Learning Python Tools & Libraries
- Jupyter Notebook: The Jupyter Notebook is an interactive environment for running code in the browser. It is a great tool for exploratory data analysis and is widely used by data scientists.
- NumPy: NumPy is one of the fundamental packages for scientific computing in Python. It contains functionality for multidimensional arrays, high-level mathematical functions such as linear algebra operations and the Fourier transform, and pseudorandom number generators.
- SciPy: SciPy is a collection of functions for scientific computing in Python. It provides, among other functionality, advanced linear algebra routines, mathematical function optimization, signal processing, special mathematical functions, and statistical distributions. scikit-learn draws from SciPy’s collection of functions for implementing its algorithms. The most important part of SciPy for us is scipy.sparse: this provides sparse matrices, which are another representation that is used for data in scikit-learn. Sparse matrices are used whenever we want to store a 2D array that contains mostly zeros.
- Matplotlib: matplotlib is the primary scientific plotting library in Python. It provides functions for making publication-quality visualizations such as line charts, histograms, scatter plots, and so on.
- pandas: pandas is a Python library for data wrangling and analysis. It is built around a data structure called the DataFrame that is modeled after the R DataFrame.
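As a small illustration of the NumPy features listed above, the sketch below touches each of them once: a multidimensional array, a linear algebra routine, the Fourier transform, and a pseudorandom number generator. It assumes NumPy is installed; the array values are arbitrary and chosen only to keep the results easy to check.

```python
import numpy as np

# A 2D (multidimensional) array and a 1D array.
a = np.array([[2.0, 0.0],
              [0.0, 4.0]])
b = np.array([2.0, 8.0])

# Linear algebra: solve the system a @ x = b.
x = np.linalg.solve(a, b)

# Fourier transform of a short signal.
spectrum = np.fft.fft(np.array([1.0, 0.0, 0.0, 0.0]))

# Pseudorandom numbers from a seeded generator (reproducible across runs).
rng = np.random.default_rng(seed=0)
samples = rng.random(3)

print(a.shape)
print(x)
print(spectrum.real)
print(samples)
```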
Machine Learning Challenges
- Poor quality of data
- Underfitting of Training Data
- Overfitting of Training Data
- Machine Learning is a complex process
- Lack of training data
- Slow implementation
- Imperfections in the algorithm when data grows.
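Underfitting and overfitting are easiest to see by comparing the error on the training data with the error on data the model has not seen. The sketch below is a hypothetical illustration: the data is made up to roughly follow y = 2x, and the three "models" (a constant, a memorising lookup, and a least-squares line) are assumptions chosen to make the contrast visible.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function x -> prediction) on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data that roughly follows y = 2x.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [0.1, 1.9, 4.2, 5.8]
test_x, test_y = [0.4, 1.4, 2.4], [0.8, 2.8, 4.8]

# Underfitting: always predict the training mean, ignoring x entirely.
mean_y = sum(train_y) / len(train_y)
def underfit(x):
    return mean_y

# Overfitting: memorise the training set and return the answer
# stored for the nearest training x.
def overfit(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]

# A reasonable fit: a least-squares line through the training data.
mean_x = sum(train_x) / len(train_x)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x
def linear(x):
    return slope * x + intercept

for name, model in [("underfit", underfit), ("overfit", overfit), ("linear", linear)]:
    print(name, round(mse(model, train_x, train_y), 3), round(mse(model, test_x, test_y), 3))
```

The constant model has a large error even on the data it was trained on. The memorising model is perfect on the training data but noticeably worse on the unseen points. The line does reasonably well on both.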