What is predict() in R?
With the predict() function in R, you can make predictions for new, unseen data. This function is an important tool for machine learning.
What is predict() in R used for?
The R function predict() is a versatile tool used in predictive modelling. It generates predictions for new or existing data points based on a previously fitted statistical model, such as a linear regression, a logistic regression, a decision tree or another modelling technique.
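As a small illustration of predict() with a model other than linear regression, the following sketch fits a logistic regression with glm() and predicts probabilities for new data. It assumes the built-in mtcars data set (am is 1 for manual transmission, wt is the weight in 1,000 lbs); the variable names logit_model and new_cars are chosen here for illustration only.

```r
# Logistic regression: probability of a manual transmission given car weight
# (mtcars ships with R; logit_model and new_cars are illustrative names)
logit_model <- glm(am ~ wt, data = mtcars, family = binomial)

# New data points must use the same column name as the predictor in the model
new_cars <- data.frame(wt = c(2.5, 3.5))

# type = "response" returns probabilities instead of values on the log-odds scale
predicted_prob <- predict(logit_model, newdata = new_cars, type = "response")
print(predicted_prob)
```

Without type = "response", predict() returns the values on the scale of the linear predictor (log-odds) for a glm() model.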
What is the syntax for predict() in R?
R’s predict() function takes as arguments a trained model and the data points that the prediction should apply to. Because predict() is a generic function, the available options and parameters depend on the type of model used; linear models fitted with lm(), for example, are handled by predict.lm(). By default, the result is a vector of predictions that can be useful for various analytical purposes, including evaluating the performance of a model, decision making or illustrating the resulting data.
predict(object, newdata, interval)
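- object: The trained model used to make the predictions
- newdata: A data frame containing the data points for the prediction; if omitted, the fitted values for the original data are returned
- interval: Optional argument for linear models specifying the type of interval to calculate ("confidence" for a confidence interval around the mean response, "prediction" for a prediction interval around individual predictions; the default is "none")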
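The effect of the interval argument can be sketched as follows. This example assumes the built-in cars data set (speed and dist) rather than the data used later in this article; the names fit and new_points are illustrative.

```r
# Linear model on the built-in 'cars' data set (stopping distance vs. speed)
fit <- lm(dist ~ speed, data = cars)

# Data points to predict for; the column name must match the predictor
new_points <- data.frame(speed = c(10, 20))

# Confidence interval for the mean response at each speed
conf_int <- predict(fit, newdata = new_points, interval = "confidence")

# Prediction interval for an individual new observation at each speed
pred_int <- predict(fit, newdata = new_points, interval = "prediction")

# Both results are matrices with the columns fit, lwr and upr
print(conf_int)
print(pred_int)
```

The fit column is identical in both results, while the prediction interval is wider than the confidence interval because it also accounts for the scatter of individual observations.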
Example of how to use predict() in R
The following example will illustrate how the predict() function in R works. We’ll use a user-defined data set with speed and distance values.
Creating and displaying data
custom_data <- data.frame(speed = c(15, 20, 25, 30, 35),
distance = c(30, 40, 50, 60, 70))
# Displaying the custom data frame
print("Custom Data Frame:")
print(custom_data)
First, we’ll create a user-defined data set for evaluating the relationship between speed and distance. We’ll use the function data.frame() to create a data frame and then define the values for the variables speed and distance as c(15, 20, 25, 30, 35) and c(30, 40, 50, 60, 70) respectively.
After we’ve created the data set, we’ll display it using the print() function. That way we can check the structure and the assigned values of our new data frame.
Output:
[1] "Custom Data Frame:"
  speed distance
1    15       30
2    20       40
3    25       50
4    30       60
5    35       70
Creating a linear model
# Creating a linear model for the custom data frame
custom_model <- lm(distance ~ speed, data = custom_data)
# Printing the model results
print("Model Results:")
print(summary(custom_model))
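Output (abridged):
[1] "Model Results:"
Call:
lm(formula = distance ~ speed, data = custom_data)
In the output, we see a linear model (custom_model) that was generated for the data set and models the relationship between speed and distance. The full summary also lists the residuals, a coefficient table and further statistical information. Because the sample values lie exactly on the line distance = 2 * speed, the fit is exact: the estimated intercept is 0 and the coefficient for speed is 2 (up to floating-point rounding), and all residuals are effectively zero. For such a perfect fit, the standard errors and p-values in the summary are not meaningful, and R may warn that the summary is unreliable.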
Defining new speed values and making predictions
# Creating a data frame with new speed values
new_speed_values <- data.frame(speed = c(40, 45, 50, 55, 60))
# Predicting future distance values using the linear model
predicted_distance <- predict(custom_model, newdata = new_speed_values)
We’ve now created another data frame (new_speed_values) with new values for speed. Then we used predict() to make predictions for the corresponding distance values using the linear model we created above. The column in the new data frame must have the same name (speed) as the predictor in the model.
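For a linear model, the prediction is the fitted intercept plus the fitted slope times the new value. The following sketch rebuilds the example so that it runs on its own and checks the result of predict() against that manual calculation; the names coefs and manual_distance are introduced here for illustration.

```r
# Same data, model and new values as in the example above
custom_data <- data.frame(speed = c(15, 20, 25, 30, 35),
                          distance = c(30, 40, 50, 60, 70))
custom_model <- lm(distance ~ speed, data = custom_data)
new_speed_values <- data.frame(speed = c(40, 45, 50, 55, 60))

# Prediction via predict()
predicted_distance <- predict(custom_model, newdata = new_speed_values)

# The same prediction computed by hand from the fitted coefficients
coefs <- coef(custom_model)  # named vector with "(Intercept)" and "speed"
manual_distance <- coefs[["(Intercept)"]] + coefs[["speed"]] * new_speed_values$speed

# Compare both results, ignoring the names that predict() attaches
print(all.equal(unname(predicted_distance), manual_distance))
```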
Displaying the predictions
# Displaying the predicted values
print("Predicted Distance Values:")
print(predicted_distance)
The output shows the distance values predicted based on speed:
[1] "Predicted Distance Values:"
  1   2   3   4   5 
 80  90 100 110 120 
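Predictions like these can also be used to evaluate the performance of a model. As a minimal sketch, the following example assumes the built-in cars data set, holds out a few rows as test data and computes the root mean squared error (RMSE) of the predictions. The split size, the seed and all variable names are arbitrary choices for illustration.

```r
# Built-in 'cars' data set: 50 observations of speed and stopping distance
set.seed(42)                         # make the random split reproducible
test_idx <- sample(nrow(cars), 10)   # hold out 10 rows for testing
train <- cars[-test_idx, ]
test <- cars[test_idx, ]

# Fit the model on the training rows only
fit <- lm(dist ~ speed, data = train)

# Predict the held-out rows and compare with the observed distances
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$dist - pred)^2))
print(rmse)
```

A lower RMSE on held-out data indicates predictions that are closer to the observed values.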
If you want to learn about processing strings for text manipulation and data cleaning in R, take a look at our tutorials on R gsub and sub and R substring.