Okay, so yesterday I was messing around trying to predict the Elena Rybakina match. You know, just for fun, seeing if I could actually get it right.

First things first, I grabbed some data. I scraped some match history, rankings, you name it. Figured more data is better, right? Used some Python with Beautiful Soup to get the info, nothing too fancy. I’m no expert coder, just trying to make something happen.
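
For anyone curious what that scraping step roughly looks like, here's a minimal sketch. It parses a made-up HTML table stored in a string instead of hitting a real site, so the markup, the `matches` table id, and the column names are all assumptions for illustration, not what any actual tennis stats page looks like.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: real match-history pages will be laid out differently.
SAMPLE_HTML = """
<table id="matches">
  <tr><th>Date</th><th>Opponent</th><th>Surface</th><th>Result</th></tr>
  <tr><td>2024-01-01</td><td>Player A</td><td>Hard</td><td>W</td></tr>
  <tr><td>2024-01-03</td><td>Player B</td><td>Clay</td><td>L</td></tr>
</table>
"""

def parse_matches(html):
    """Turn each data row of the table into a dict keyed by column header."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="matches")
    rows = table.find_all("tr")
    # The first row holds the column headers; lowercase them for use as keys.
    headers = [th.get_text(strip=True).lower() for th in rows[0].find_all("th")]
    matches = []
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        matches.append(dict(zip(headers, cells)))
    return matches

matches = parse_matches(SAMPLE_HTML)
print(matches)
```

With a real page you'd fetch the HTML first and adjust the selectors to whatever the site actually uses.
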
Then came the feature engineering. This was a pain! I tried to distill all that raw data into something useful. Win rates on different surfaces, head-to-head records, recent form… I even threw in some stuff like average aces per match, because why not? It felt like pure guesswork at times, just trying to see what sticks. I was copying some examples from other sports prediction models, tweaking them for tennis.
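
To make that a bit more concrete, here's a rough sketch of how features like these could be computed in plain Python. The match records, field names, and the `build_features` helper are all invented for illustration; they aren't the actual data or code from this project.

```python
# Hypothetical match records, assumed to be ordered oldest to newest.
MATCHES = [
    {"opponent": "Player A", "surface": "hard", "won": True, "aces": 7},
    {"opponent": "Player B", "surface": "clay", "won": False, "aces": 3},
    {"opponent": "Player A", "surface": "hard", "won": False, "aces": 5},
    {"opponent": "Player C", "surface": "grass", "won": True, "aces": 9},
]

def win_rate(matches):
    """Fraction of matches won; 0.0 when there are no matches to go on."""
    return sum(m["won"] for m in matches) / len(matches) if matches else 0.0

def build_features(matches, opponent, surface, recent_n=3):
    """Distill raw match records into a few numeric features."""
    on_surface = [m for m in matches if m["surface"] == surface]
    head_to_head = [m for m in matches if m["opponent"] == opponent]
    recent = matches[-recent_n:]  # "recent form" = the last few matches
    avg_aces = sum(m["aces"] for m in matches) / len(matches) if matches else 0.0
    return {
        "surface_win_rate": win_rate(on_surface),
        "h2h_win_rate": win_rate(head_to_head),
        "recent_win_rate": win_rate(recent),
        "avg_aces": avg_aces,
    }

features = build_features(MATCHES, opponent="Player A", surface="hard")
print(features)
```

The fallback to 0.0 for missing history is just one possible choice; with only a handful of matches, these rates are very noisy either way.
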
Next up, model selection. I’m no machine learning whiz, so I stuck with some simple stuff. Tried a logistic regression, a random forest, and even a basic neural network (using scikit-learn and TensorFlow, mostly copy-pasting code from tutorials). Spent a bunch of time fiddling with parameters, trying to get the best accuracy on a validation set.
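
Here's a small sketch of what comparing two of those models might look like with scikit-learn. The data is randomly generated stand-in data, not real tennis matches, and the hyperparameters are arbitrary picks rather than the ones used in this project. I've left the neural network out to keep it short.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 200 fake "matches" with 4 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
# Fake win/loss labels loosely driven by the first feature plus noise.
y = (X[:, 0] + rng.normal(scale=1.0, size=200) > 0).astype(int)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    # 5-fold cross-validation: average accuracy over five train/validate splits.
    scores[name] = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: mean CV accuracy = {scores[name]:.3f}")
```

Cross-validation is swapped in here for a single validation set because it gives a steadier estimate on small datasets, which is probably closer to the situation with one player's match history.
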
After that, I trained the models on the data I had and started predicting. Most of them were consistently wrong. The logistic regression was a total disaster, the random forest was slightly better, and the neural network was marginally better than that, but the predictions were still very unreliable. I spent a good chunk of time trying different hyperparameters and some more complex models, but couldn’t move the needle much.
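
One way to put a number on "consistently wrong" is to compare a model's accuracy against a dumb baseline. The sketch below does that with completely made-up predictions and outcomes (1 = win, 0 = loss); none of these numbers come from the actual models.

```python
def accuracy(predicted, actual):
    """Share of predictions that match the actual outcomes."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Made-up outcomes and model predictions, purely for illustration.
actual = [1, 0, 1, 1, 0, 1, 0, 1]
model_preds = [1, 1, 0, 1, 0, 0, 1, 1]

# Naive baseline: always predict the most common outcome.
# (In a real setup the majority class should come from the training data,
# not from the matches being evaluated.)
majority = max(set(actual), key=actual.count)
baseline_preds = [majority] * len(actual)

model_acc = accuracy(model_preds, actual)
baseline_acc = accuracy(baseline_preds, actual)
print(f"model: {model_acc:.3f}, baseline: {baseline_acc:.3f}")
```

If a model can't beat "always pick the usual outcome", more hyperparameter fiddling probably isn't the fix.
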
Finally, I tested it on the actual match. It was a total flop: none of the models predicted the outcome correctly! Rybakina lost, and my models had confidently predicted she would win! So much for my amazing AI skills.

Lessons learned? Predicting tennis is hard! Maybe I need way more data, or better features, or fancier models. Or maybe it’s just luck. Whatever the reason, it was a fun little project, even if it didn’t work out.
Steps Recap
- Grab data
- Feature engineering
- Model selection
- Model training
- Make predictions
Future Improvements
- Get way more data
- Better data sources
- Better features
- Try some other models