NOTES ABOUT MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE (AI)

Notes:
Vectors and matrices are the basic building blocks of machine learning.

  • Supervised learning: tagging. http://stanford.io/2nRlxxp
    • Train with all the data, tagging it so the model can predict future events. Example: train a Raspberry Pi so it can recognise bird images captured with the camera.
  • Semi-supervised learning; often grouped here: reinforcement learning (see the bandit sketch below).
    • It does not require labelled training data, but a lot of trial and error instead.
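
A minimal sketch of that trial-and-error idea, as an epsilon-greedy multi-armed bandit in plain numpy (the arm payout probabilities are made-up toy values):

  # Trial and error with no labelled data: an epsilon-greedy agent
  # learns which "arm" (slot machine) pays best from rewards alone.
  import numpy as np

  rng = np.random.default_rng(0)
  true_payouts = [0.2, 0.5, 0.8]   # hidden reward probability per arm (toy values)
  estimates = np.zeros(3)          # the agent's running estimate per arm
  counts = np.zeros(3)
  epsilon = 0.1                    # how often we explore at random

  for step in range(1000):
      if rng.random() < epsilon:
          arm = int(rng.integers(3))        # explore: try a random arm
      else:
          arm = int(np.argmax(estimates))   # exploit: best arm so far
      reward = float(rng.random() < true_payouts[arm])   # 1 if the arm paid out
      counts[arm] += 1
      estimates[arm] += (reward - estimates[arm]) / counts[arm]   # incremental mean

  print(estimates)   # converges towards the true payout probabilities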
  • Unsupervised learning: discovering patterns in unlabelled data.
    • It is all about clustering data and inferring relationships.
    • k-means clustering (see the sketch below)
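
A minimal sketch of k-means on unlabelled data, which also shows the pandas + scikit-learn combination mentioned further down (the bird measurements and column names are made up for illustration):

  # Cluster unlabelled rows into groups; no tags are given, k-means infers them.
  import pandas as pd
  from sklearn.cluster import KMeans

  df = pd.DataFrame({
      "wingspan_cm": [9, 10, 25, 27, 60, 62],
      "weight_g":    [12, 14, 90, 95, 900, 950],
  })

  kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(df)
  df["cluster"] = kmeans.labels_    # each row gets an inferred group
  print(df)
  print(kmeans.cluster_centers_)    # one centre per discovered cluster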

  • Deep learning (i.e. neural networks) http://stanford.io/2BsQ91Q
    • Layers: input, hidden, output. But also a bias input ("poking" the hidden layers); see the sketch below.
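
A minimal sketch of one forward pass through such a network in numpy, to make the layer/bias vocabulary concrete (all sizes and weights are arbitrary placeholders):

  # input -> hidden -> output; each layer is a matrix multiply plus a bias vector.
  import numpy as np

  rng = np.random.default_rng(0)
  x = rng.standard_normal(4)            # input layer: 4 features

  W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)   # hidden layer
  W2, b2 = rng.standard_normal((2, 8)), rng.standard_normal(2)   # output layer

  hidden = np.maximum(0, W1 @ x + b1)   # bias "pokes" the layer; ReLU non-linearity
  output = W2 @ hidden + b2             # e.g. 2 raw class scores
  print(output)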


  • Reinforcement learning: beyond self-supervision. TODO


  • Train the model yourself, but also consider transfer learning: reuse existing pre-trained models.


  • For model complexity (see the sketch below):
    • Too low: high bias (the fit is nearly a flat line; it underfits).
    • Too high: high variance (the fit adjusts to every data point; not good either, it overfits).
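
A quick numpy sketch of that trade-off, assuming a noisy sine as the toy dataset: a degree-1 polynomial underfits (high bias), a degree-15 polynomial overfits (high variance):

  import numpy as np

  rng = np.random.default_rng(0)
  x = np.linspace(0, 1, 20)
  y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(20)   # noisy training points
  x_test = np.linspace(0, 1, 200)
  y_true = np.sin(2 * np.pi * x_test)

  for degree in (1, 15):
      coeffs = np.polyfit(x, y, degree)             # fit a polynomial of this degree
      pred = np.polyval(coeffs, x_test)
      print(degree, np.mean((pred - y_true) ** 2))  # test error is high at both extremes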




  • Manage datasets with pandas and scikit-learn (as in the k-means sketch above).
  • Convolution studies how a shape is modified by another.
  • A typical CNN stacks conv, relu, conv, relu, conv, … (see the sketch below)
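
A minimal sketch of that conv/relu stack, assuming PyTorch and arbitrary layer sizes:

  # Each Conv2d slides learned filters over the input (convolution: how one
  # shape is modified by another); ReLU adds the non-linearity between them.
  import torch
  import torch.nn as nn

  cnn = nn.Sequential(
      nn.Conv2d(3, 16, kernel_size=3, padding=1),
      nn.ReLU(),
      nn.Conv2d(16, 32, kernel_size=3, padding=1),
      nn.ReLU(),
      nn.Conv2d(32, 64, kernel_size=3, padding=1),
      nn.ReLU(),
  )

  image = torch.randn(1, 3, 64, 64)   # one fake RGB image, 64x64 pixels
  features = cnn(image)
  print(features.shape)               # torch.Size([1, 64, 64, 64])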

AI HARDWARE - GPUs AND ACCELERATORS

  • AMD Instinct MI series
  • Amazon's Inferentia (for machine learning inference on AWS)
  • Google's TPUs (Tensor Processing Units, custom hardware for Google’s machine learning tasks)
  • Intel Gaudi (designed for deep learning training)
  • NVIDIA GPUs (e.g., A100, H100, used for training and inference in deep learning applications)
  • NVIDIA Tensor Cores (hardware feature within NVIDIA GPUs, optimized for mixed-precision AI workloads)

  • Attention mechanism (just a formula that makes training models easier by letting each token weigh the others; see the sketch after this list)
  • Transformer architecture (introduced by Google researchers in the 2017 paper "Attention Is All You Need"; Hugging Face created the popular transformers library, not the architecture)
    • Transformers are built around the attention mechanism.
      • A precursor for sharing pretrained models: TensorFlow Hub.
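
A minimal numpy sketch of that formula, scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V (the shapes, 5 tokens of dimension 8, are arbitrary):

  import numpy as np

  rng = np.random.default_rng(0)
  Q = rng.standard_normal((5, 8))   # queries: what each token is looking for
  K = rng.standard_normal((5, 8))   # keys: what each token offers
  V = rng.standard_normal((5, 8))   # values: the information actually passed on

  scores = Q @ K.T / np.sqrt(8)                    # similarity of every token pair
  weights = np.exp(scores)
  weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
  attended = weights @ V                           # weighted mix of values per token
  print(attended.shape)                            # (5, 8)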

PRACTICAL NOTES ON MODELS

  • Models multiply matrices.
  • Those matrices are multi-dimensional: tensors.
    • They are made of weights and biases. When defining a model, the weights and biases are generically called parameters.
    • E.g. a "100B" model: all the tensors' biases and weights, added together, come to ~100 billion parameters (see the sketch below).
  • The HF transformers library is different from the Transformer architecture. HF's is a framework for loading, training, fine-tuning, and deploying transformer models across NLP and vision tasks. It provides access to thousands of pretrained models, simplifies workflows with task-specific pipelines, and supports custom training on new datasets. Beyond downloading models, it enables production-ready deployment with optimizations for diverse hardware.
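
A minimal sketch of both points, loading a pretrained model with the HF transformers library and adding up its weights and biases (DistilBERT is just a small example model):

  from transformers import AutoModel

  model = AutoModel.from_pretrained("distilbert-base-uncased")
  n_params = sum(p.numel() for p in model.parameters())   # all weights + biases
  print(f"{n_params / 1e6:.0f}M parameters")   # ~66M here; "100B" models are the same idea with bigger tensors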

HUGGINGFACE

  • Models, datasets and prototypes.
  • Open-source and open-weight.
  • We can download a pre-trained Llama (e.g. via Ollama) and then fine-tune it.
    • One of the reasons is so it identifies patterns better (text, images, …). Related: embeddings capture the inherent properties and relationships of the original data in a condensed format and are often used in machine learning use cases, e.g. for better classification (see the sketch below).
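
A minimal sketch of computing text embeddings with a pretrained model, assuming mean pooling over the last hidden state (the model name is just one common choice):

  import torch
  from transformers import AutoTokenizer, AutoModel

  name = "sentence-transformers/all-MiniLM-L6-v2"
  tokenizer = AutoTokenizer.from_pretrained(name)
  model = AutoModel.from_pretrained(name)

  sentences = ["a bird on a branch", "a cat on a mat"]
  inputs = tokenizer(sentences, padding=True, return_tensors="pt")
  with torch.no_grad():
      hidden = model(**inputs).last_hidden_state   # (batch, tokens, 384)

  mask = inputs["attention_mask"].unsqueeze(-1)    # ignore padding tokens in the mean
  embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
  print(embeddings.shape)                          # torch.Size([2, 384]): one condensed vector per sentence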