update mle questions

eavelardev · Nov 2, 2022 · f3b7013
1 parent f7f8eb0

Showing 4 changed files with 4,019 additions and 851 deletions.
diff --git a/mle_certificate/Certification Study Group - question.png b/mle_certificate/Certification Study Group - question.png
diff --git a/mle_certificate/certification_exam_guide.ipynb b/mle_certificate/certification_exam_guide.ipynb
@@ -39,7 +39,25 @@
     "\n",
     "* Google Developers - [Google Machine Learning Education](https://developers.google.com/machine-learning)\n",
     "\n",
-    "Couses:\n",
+    "* Google Cloud - [Implement machine learning](https://cloud.google.com/architecture/framework/system-design/ai-ml)\n",
+    "\n",
+    "* Google Cloud - [Best practices for implementing machine learning on Google Cloud](https://cloud.google.com/architecture/ml-on-gcp-best-practices)\n",
+    "\n",
+    "* Google Cloud - [Data preprocessing for ML: options and recommendations](https://cloud.google.com/architecture/data-preprocessing-for-ml-with-tf-transform-pt1)\n",
+    "\n",
+    "* Google Cloud - [Data preprocessing for ML using TensorFlow Transform](https://cloud.google.com/architecture/data-preprocessing-for-ml-with-tf-transform-pt2)\n",
+    "\n",
+    "* Google Cloud - [Best practices for performance and cost optimization for machine learning](https://cloud.google.com/architecture/best-practices-for-ml-performance-cost)\n",
+    "\n",
+    "* Google Cloud - [Architecture for MLOps using TFX, Kubeflow Pipelines, and Cloud Build](https://cloud.google.com/architecture/architecture-for-mlops-using-tfx-kubeflow-pipelines-and-cloud-build)\n",
+    "\n",
+    "* Google Cloud - [Intelligent Products Essentials reference architecture](https://cloud.google.com/architecture/intelligent-products-essentials-architecture)\n",
+    "\n",
+    "* Google Cloud - [MLOps with Intelligent Products Essentials](https://cloud.google.com/architecture/mlops-intelligent-products-essentials)\n",
+    "\n",
+    "* Google Cloud - [Analyzing training-serving skew with TensorFlow Data Validation](https://cloud.google.com/architecture/ml-modeling-monitoring-analyzing-training-server-skew-in-ai-platform-prediction-with-tfdv)\n",
+    "\n",
+    "Courses:\n",
     "\n",
     "* Google Cloud Skills Boost - [Machine Learning Engineer Learning Path](https://www.cloudskillsboost.google/paths/17)\n",
     "\n",

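Review note: the `mle_exam.ipynb` changes in this commit include a commented-out cell that scores every pair of questions by shared word tokens divided by the size of the smaller token set (the overlap coefficient), apparently to surface near-duplicate questions. The sketch below restates that idea in runnable form under a few assumptions. It swaps nltk's `RegexpTokenizer(r'\w+')` for the standard-library `re.findall`, returns plain tuples instead of a pandas DataFrame, and uses the hypothetical names `overlap_scores` and `sample`. It also skips empty token sets to avoid dividing by zero, and it uses its own loop variables so the notebook's global question index `i` is not overwritten.

```python
import re
from itertools import combinations


def overlap_scores(texts):
    """Return (score, text_a, text_b) tuples, highest score first.

    score = |tokens_a & tokens_b| / min(|tokens_a|, |tokens_b|),
    rounded to two decimals as in the commented-out notebook cell.
    """
    texts = [t for t in texts if t]  # drop empty questions
    tokens = [set(re.findall(r"\w+", t)) for t in texts]
    rows = []
    for a, b in combinations(range(len(texts)), 2):
        smaller = min(len(tokens[a]), len(tokens[b]))
        if smaller == 0:
            continue  # nothing to compare; avoids division by zero
        shared = len(tokens[a] & tokens[b])
        rows.append((round(shared / smaller, 2), texts[a], texts[b]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


# Made-up sample questions, for illustration only.
sample = [
    "Which metric should you use for an imbalanced dataset?",
    "Which metric should you use for a balanced dataset?",
    "How do you deploy a model to an endpoint?",
]
ranked = overlap_scores(sample)
print(ranked[0])
```

Indexing tokens by position rather than by question text also keeps exact duplicate questions in the comparison. A dict keyed by the text would collapse them into a single entry.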
diff --git a/mle_certificate/mle_exam.ipynb b/mle_certificate/mle_exam.ipynb
@@ -7,65 +7,110 @@
    "outputs": [],
    "source": [
     "import random\n",
-    "from questions import questions"
+    "from questions import questions\n",
+    "# from nltk.tokenize import RegexpTokenizer\n",
+    "import pandas as pd"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 2,
    "metadata": {},
+   "outputs": [],
+   "source": [
+    "# pd.set_option('display.max_colwidth', None)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Number of question: 76\n"
+      "Number of total questions: 221\n"
      ]
     }
    ],
    "source": [
-    "print(f'Number of question: {len(questions)}')\n",
-    "random_questions = random.sample(questions, len(questions))\n",
+    "print(f'Number of total questions: {len(questions)}')\n",
+    "questions = random.sample(questions, len(questions))\n",
+    "# questions = [q for q in questions if 'udemy' in q['tags']]\n",
+    "# print(f'Number of questions: {len(questions)}')\n",
     "i = -1\n",
     "c = False "
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# qs = [q['question'] for q in questions if len(q['question']) > 0]\n",
+    "# tokenizer = RegexpTokenizer(r'\\w+')\n",
+    "# keywords = {q: set(tokenizer.tokenize(q)) for q in qs}\n",
+    "\n",
+    "# scores = []\n",
+    "# file1 = []\n",
+    "# file2 = []\n",
+    "\n",
+    "# for i in range(len(keywords)-1):\n",
+    "#     for j in range(i+1, len(keywords)):\n",
+    "#         keyword1, keyword2 = keywords[qs[i]], keywords[qs[j]]\n",
+    "#         intersect = len(keyword1.intersection(keyword2))\n",
+    "#         min_set = min(len(keyword1), len(keyword2))\n",
+    "#         rate = round(intersect / min_set, 2)\n",
+    "        \n",
+    "#         scores.append(rate)\n",
+    "#         file1.append(qs[i])\n",
+    "#         file2.append(qs[j])\n",
+    "\n",
+    "# data = {'score': scores, 'file1': file1, 'file2': file2}\n",
+    "# df = pd.DataFrame(data).sort_values(by=['score', 'file1'], ascending=False)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
    "metadata": {},
    "outputs": [],
    "source": [
     "def get_question(i):\n",
     "    i += 1\n",
     "    i %= len(questions)\n",
-    "    question = random_questions[i]\n",
-    "    print(question['question'])\n",
+    "    question = questions[i]\n",
+    "    print(question['question'], end='\\n')\n",
     "\n",
     "    options = random.sample(list(question['options'].values()), len(question['options']))\n",
+    "    # options = question['options'].values()\n",
+    "    \n",
     "    for option in options:\n",
-    "        print(f'* {option}\\n')\n",
+    "        print(f'\\n* {option}')\n",
     "\n",
     "    return i, False"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [],
    "source": [
     "def get_answers(i, c):\n",
     "    if c is False:\n",
-    "        question = random_questions[i]\n",
+    "        question = questions[i]\n",
     "        if(len(question['answers']) > 0):\n",
     "            answers = random.sample(question['answers'], len(question['answers']))\n",
     "\n",
     "            for letter in answers:\n",
     "                answer = question['options'][letter]\n",
     "                print(f'* {answer}\\n')   \n",
     "\n",
-    "            print(question['explanation']) \n",
+    "            if question['explanation']:\n",
+    "                print(question['explanation'] + '\\n') \n",
     "\n",
     "            for reference in question['references']:\n",
     "                print(f'* {reference}')\n",
@@ -80,23 +125,28 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 7,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "The data science team has built a DNN model to monitor and detect defective products using the images from the assembly line of an automobile manufacturing company. As a Google ML Engineer, you need to measure the performance of the ML model for the test dataset/images. Which of the following would you choose?\n",
+      "You work for a video game company. Your management came up with the idea of creating a game in which the characteristics of the characters were taken from those of the human players. You have been asked to generate not only the avatars but also various visual expressions during the game actions.\n",
+      "\n",
+      "* Recurrent Neural Network\n",
+      "\n",
+      "* Reinforcement Learning\n",
       "\n",
-      "* The TP value\n",
+      "* Convolutional Neural Network\n",
       "\n",
-      "* The AUC value\n",
+      "* Autoencoder and self-encoder\n",
       "\n",
-      "* The precision value\n",
+      "* Feedforward Neural Network\n",
       "\n",
-      "* The recall value\n",
-      "\n"
+      "* GAN Generative Adversarial Network\n",
+      "\n",
+      "* Transformers\n"
      ]
     }
    ],
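Review note: the printed question above depends on the record shape that `get_question` and `get_answers` read: `options` is a dict keyed by letter, `answers` is a list of those letters, and `explanation`, `references`, and `tags` are optional extras. The following is a minimal sketch of that shape and of the shuffle-then-look-up flow. The record contents and the helper names `render_question` and `render_answers` are hypothetical, and the helpers return lines instead of printing so they are easy to check.

```python
import random

# Hypothetical record; the field names follow the keys used in the notebook.
question = {
    "question": "Which model family is suited to generating new avatars?",
    "options": {
        "A": "Recurrent Neural Network",
        "B": "GAN Generative Adversarial Network",
        "C": "Feedforward Neural Network",
    },
    "answers": ["B"],
    "explanation": "",
    "references": [],
    "tags": [],
}


def render_question(q, rng=random):
    """Return the question text followed by its options in shuffled order."""
    options = rng.sample(list(q["options"].values()), len(q["options"]))
    return [q["question"]] + [f"* {option}" for option in options]


def render_answers(q):
    """Return answer texts, then the explanation (if any), then references."""
    lines = [f"* {q['options'][letter]}" for letter in q["answers"]]
    if q["explanation"]:  # skip empty explanations, as the commit does
        lines.append(q["explanation"])
    lines.extend(f"* {reference}" for reference in q["references"])
    return lines


shown = render_question(question, random.Random(0))
answer_lines = render_answers(question)
print("\n".join(shown))
print("\n".join(answer_lines))
```

Because answers are stored as letters and resolved through `options`, shuffling the displayed option order never changes which option counts as correct.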
@@ -106,14 +156,29 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 8,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "\n"
+      "* GAN Generative Adversarial Network\n",
+      "\n",
+      "GAN is a special class of machine learning frameworks used for the automatic generation of facial images.\n",
+      "GAN can create new characters from the provided images.\n",
+      "It is also used with photographs and can generate new photos that look authentic.\n",
+      "It is a kind of model highly specialized for this task. So, it is the best solution.\n",
+      "* Feedforward neural networks are the classic example of neural networks. In fact, they were the first and most elementary type of artificial neural network.\n",
+      "Feedforward neural networks are mainly used for supervised learning when the data, mainly numerical, to be learned is neither time-series nor sequential (such as NLP).\n",
+      "* The convolutional neural network (CNN) is a type of artificial neural network extensively used for image recognition and classification. It uses the convolutional layers, that is, the reworking of sets of pixels by running filters on the input pixels.\n",
+      "* A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence.\n",
+      "* A transformer is a deep learning model that can give different importance to each part of the input data.\n",
+      "* Reinforcement Learning provides a software agent that evaluates possible solutions through a progressive reward in repeated attempts. It does not need to provide labels. But it requires a lot of data and several trials, and the possibility to evaluate the validity of each attempt.\n",
+      "* Autoencoder is a neural network aimed to transform and learn with a compressed representation of raw data.\n",
+      "\n",
+      "* https://en.wikipedia.org/wiki/Generative_adversarial_network\n",
+      "* https://developer.nvidia.com/blog/photo-editing-generative-adversarial-networks-2/\n"
      ]
     }
    ],