Before getting into our topic, let's see why normalization is essential for data when it comes to the analysis and prediction stages.
What is Data Normalization?
Today's world runs on the data generated by our day-to-day lives. For instance, buying products on Amazon, commenting on products you have purchased before, and even daily binge-watching of web series are all sources of data collection. The same data is then useful for forecasting and customer recommendations too. Okay! Let's start with why data needs to be normalized. In the database world, data should be normalized before making use of it for…
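As a quick taste of what normalization looks like in practice, here is a minimal min-max rescaling sketch in plain Python (the sample `prices` list is made up for illustration; real projects often reach for `sklearn.preprocessing.MinMaxScaler` instead):

```python
# Min-max normalization: rescale each feature to the [0, 1] range so that
# features measured on different scales become comparable.

def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # avoid division by zero on constant data
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

prices = [10, 20, 30, 40, 50]         # hypothetical example data
print(min_max_normalize(prices))      # [0.0, 0.25, 0.5, 0.75, 1.0]
```

After this transform, a distance-based model no longer lets a large-valued feature drown out the others.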
Humans grow and learn through day-to-day activities right from childhood. We acquire knowledge by learning one task, and then we use that same knowledge to solve related tasks. Take a real-world scenario such as…
The perceptron is a supervised machine learning algorithm used for classification. There are two types of binary classifiers: a linear binary classifier separates the data by drawing a straight line, while a non-linear binary classifier is needed when the data cannot be separated by a straight line.
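To make the linear case concrete, here is a bare-bones perceptron sketch trained on the AND gate (a classic linearly separable toy problem; the learning rate and epoch count are arbitrary choices for illustration):

```python
# Minimal perceptron: a linear binary classifier that nudges its weights
# toward each misclassified point until a separating line is found.

def train_perceptron(samples, labels, epochs=10, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = y - pred            # 0 when correct, +/-1 when wrong
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]                      # AND is linearly separable
w, b = train_perceptron(X, y)
preds = [1 if w[0] * a + w[1] * c + b > 0 else 0 for a, c in X]
print(preds)                          # [0, 0, 0, 1]
```

Try the same code with XOR labels `[0, 1, 1, 0]` and it will never converge, which is exactly the non-linear case a single perceptron cannot handle.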
In today's world, time moves fast, and the pace of invention keeps up with it. AI solutions give machines a new platform to think like the human brain. The ANN plays a vital role here: basically, it functions the same way biological neurons work in humans. To make…
Today we are going to see a very interesting topic. In the modern world, AI plays a vital role in every domain. For instance, the retail business relies heavily on ML. Big giants like Walmart, Amazon, and Flipkart use machine learning as a key feature to make their products easily reachable to customers.
Well, how is it possible for such a thing to reach the customers? Here is an example: if person 'X' buys a new shirt from the 'XYZ' brand, and after a month new shirts are launched in the market with a new sort of…
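The idea behind reaching customer 'X' can be sketched as a toy item-to-item recommendation: suggest products that other customers bought together with the item 'X' already purchased. The `baskets` data below is entirely made up for illustration; real recommenders work on far richer signals:

```python
# Co-purchase recommendation sketch: count how often other items appear in
# the same basket as the target item, then suggest the most frequent ones.

from collections import Counter

# Each entry is one customer's purchase history (assumed example data).
baskets = [
    {"shirt", "jeans"},
    {"shirt", "jeans", "belt"},
    {"shirt", "belt"},
    {"jeans", "shoes"},
]

def recommend(item, baskets, top_n=2):
    co_counts = Counter()
    for basket in baskets:
        if item in basket:
            co_counts.update(basket - {item})
    # Tie order may vary between runs because sets are unordered.
    return [p for p, _ in co_counts.most_common(top_n)]

print(recommend("shirt", baskets))    # 'jeans' and 'belt', in some order
```

Here a shirt buyer would be shown jeans and belts, because those co-occur most often with shirts in the purchase history.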
Naive Bayes is a probabilistic machine learning algorithm. It is widely used to solve classification problems. In addition, the algorithm works well on natural language processing (NLP) problems.
About Bayes Theorem
The Naive Bayes algorithm is based on Bayes' theorem. Let's see what the theorem says.
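Bayes' theorem states that P(A|B) = P(B|A) · P(A) / P(B). A small worked example makes it concrete; the spam probabilities below are invented numbers purely for illustration:

```python
# Bayes' theorem: P(spam | "offer") = P("offer" | spam) * P(spam) / P("offer")
# All probabilities here are assumed toy values, not real statistics.

p_spam = 0.3                          # P(spam): prior
p_offer_given_spam = 0.8              # P("offer" | spam): likelihood
p_offer_given_ham = 0.1               # P("offer" | not spam)

# P("offer") via the law of total probability
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

p_spam_given_offer = p_offer_given_spam * p_spam / p_offer
print(round(p_spam_given_offer, 3))   # 0.774
```

So even with a modest 30% prior, seeing the word "offer" pushes the spam probability above 77%, which is exactly the kind of update Naive Bayes performs for every feature.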
The Support Vector Machine is one of the most popular supervised machine learning algorithms, used for classification as well as regression. The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that new data points can easily be placed in the correct category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine.
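Once trained, a linear SVM classifies by the sign of w·x + b, where w·x + b = 0 is the hyperplane. The weights below are assumed values just to illustrate the decision rule; in practice they are learned, for example with `sklearn.svm.SVC`:

```python
# Classify by which side of the hyperplane w.x + b = 0 a point falls on.

def classify(x, w, b):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical learned hyperplane: 2*x1 - x2 - 1 = 0
w, b = [2.0, -1.0], -1.0

print(classify([2.0, 1.0], w, b))     # 1: on the positive side
print(classify([0.0, 1.0], w, b))     # -1: on the negative side
```

The support vectors are precisely the training points with the smallest |w·x + b|; moving any other point would not change the boundary.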
Hierarchical clustering is another unsupervised machine learning algorithm, used to group unlabeled datasets into clusters; it is also known as hierarchical cluster analysis (HCA).
In this algorithm, we develop the hierarchy of clusters in the form of a tree, and this tree-shaped structure is known as a dendrogram. Sometimes the results of K-means clustering and hierarchical clustering may look similar, but the two differ in how they work: in hierarchical clustering there is no requirement to predetermine the number of clusters, as there is in the K-means algorithm.
The hierarchical clustering technique has two approaches:
K-Means Clustering is an unsupervised machine learning algorithm used to solve clustering problems in machine learning. In real-world scenarios, there may only be unlabelled data available to solve a problem. In such cases, the K-means algorithm plays a vital role: it takes the unlabelled data and divides it into subgroups, where each group is termed a cluster. Grouping is done by placing data with similar size, shape, measure, or characteristics together, so that similar data ends up in the same cluster. Here comes the K value: how many clusters need to be created. There is a separate technique to…
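The core loop of K-means is short enough to sketch directly: assign each point to its nearest centroid, recompute the centroids, and repeat. The 1-D data and starting centroids below are invented for illustration:

```python
# Bare-bones K-Means on 1-D data: alternate between assigning points to
# the nearest centroid and moving each centroid to its group's mean.

def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        groups = {c: [] for c in centroids}
        for p in points:
            nearest = min(centroids, key=lambda c: abs(p - c))
            groups[nearest].append(p)
        # recompute each centroid as the mean of its assigned points
        centroids = [sum(g) / len(g) for g in groups.values() if g]
    return sorted(centroids)

data = [1, 2, 3, 10, 11, 12]          # two obvious groups (toy data)
print(kmeans_1d(data, centroids=[1, 12]))   # [2.0, 11.0]
```

Here K = 2 was chosen by eye; picking K for real data is its own problem (e.g. the elbow method), which is the "separate technique" the intro alludes to.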
K-nearest neighbors (KNN) is a type of supervised learning algorithm used for both regression and classification. KNN tries to predict the correct class for the test data by calculating the distance between the test data and all the training points, and then selecting the K points closest to the test data. The algorithm calculates the probability of the test data belonging to each class among those 'K' training points, and the class holding the highest probability is selected. In the case of regression, the predicted value is the mean of the 'K' selected training points.
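That description maps almost line-for-line onto code. A minimal KNN classifier sketch, with a tiny made-up training set for illustration:

```python
# Minimal KNN classifier: find the K training points nearest to the test
# point (Euclidean distance) and take a majority vote over their labels.

from collections import Counter

def knn_predict(train, test_point, k=3):
    # train: list of ((features...), label) pairs
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda item: dist(item[0], test_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]   # class with the most votes

train = [((1, 1), "A"), ((1, 2), "A"), ((2, 2), "A"),
         ((8, 8), "B"), ((9, 8), "B")]
print(knn_predict(train, (1.5, 1.5)))   # prints: A
```

For regression, the last line of `knn_predict` would instead return the mean of the K neighbours' values, exactly as the paragraph above describes.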
Random Forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both Classification and Regression problems in ML. It is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
As the name suggests, “Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset.” …
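The ensemble idea can be illustrated with a deliberately simplified forest: bootstrap-sample the data, fit one decision stump (a single threshold test) per sample, and combine the stumps by majority vote. Real random forests grow full trees and also subsample features; this sketch, on made-up 1-D data, only shows the bagging-plus-voting skeleton:

```python
# Toy "forest": many weak learners trained on bootstrap samples, combined
# by majority vote. Stumps stand in for the decision trees of a real
# random forest.

import random
from collections import Counter

def fit_stump(data):
    # data: list of (feature_value, label); pick the threshold with fewest errors
    best = None
    for t, _ in data:
        for left in (0, 1):
            right = 1 - left
            errs = sum((left if x <= t else right) != y for x, y in data)
            if best is None or errs < best[0]:
                best = (errs, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def fit_forest(data, n_trees=25, seed=0):
    rng = random.Random(seed)
    # each stump sees its own bootstrap sample (drawn with replacement)
    stumps = [fit_stump([rng.choice(data) for _ in data])
              for _ in range(n_trees)]
    def predict(x):
        votes = Counter(s(x) for s in stumps)
        return votes.most_common(1)[0][0]
    return predict

data = [(1, 0), (2, 0), (3, 0), (8, 1), (9, 1), (10, 1)]   # toy dataset
forest = fit_forest(data)
print(forest(1), forest(10))
```

Individual stumps trained on different bootstraps can disagree, but the vote averages their mistakes away, which is the point the quoted definition is making.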
Data Science and Machine Learning enthusiast | Software Architect | Full stack developer