======Machine Learning 02 : From Paper to Product======

With State Library closed to the public due to COVID-19, for this Machine Learning (ML) workshop we won't be able to use the Digital Media Lab at The Edge. Instead we will be exploring examples of how a Machine Learning idea becomes a product that we can use, following projects from a research paper through to a commercial software product. In the second half of the workshop, we will take a look at Google Colab - a free ML toolkit that we can test out online.

The workshop is not an introduction to coding or math, but we will give a general overview of how ML is defined, how it is developed into a product and where it is commonly used today.

=====Requirements=====

All we need to get started for this workshop is a Google account to access Google Colab in the second half of the workshop. If you don't have one you can quickly [[https://accounts.google.com/signup/v2/webcreateaccount?hl=en&flowName=GlifWebSignIn&flowEntry=SignUp|sign up]]. If you don't want to create a Google account, you can always just follow along with the examples.

=====Outcomes=====

  * A basic Machine Learning background
  * A short history of speech synthesis
  * ML speech synthesis examples:
    * Commercial
    * Open Source research
  * Using Google Colab
  * Using Spleeter (audio source separation)

{{page>workshops:public:machine_learning:ideepcolor:start#background}}

=====ML - From Paper to Product=====

We'll be exploring a few ML ideas, but to start with let's follow [[https://www.youtube.com/watch?v=rp5LUSbdsys|"Real-Time User-Guided Image Colorization with Learned Deep Priors", by Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S. Lin, Tianhe Yu, Alexei A. Efros]] from paper to product. It's not a recent paper in ML terms, where it seems every month brings another breakthrough, but we can follow this paper right through to its release in [[https://www.adobe.com/au/products/photoshop-elements/whats-new.html|Adobe Photoshop Elements 2020]].

====Research Papers on arXiv.org====

[[https://arxiv.org/|arXiv.org]] is probably the world's biggest and fastest growing collection of preprint electronic scientific papers, covering mathematics, physics, astronomy, electrical engineering, computer science, quantitative biology, statistics, mathematical finance and economics. All of the ML ideas we will be looking at are either first published on arXiv.org, or reference papers on the site.

===Finding a Paper===

Papers on arXiv.org are moderated but not peer-reviewed, which means the speed and volume of publishing on this open-access repository is overwhelming. But to get started, let's say we are interested in re-colourisation of black and white images of our grandparents at their wedding, but we don't want to do all the work ourselves. We'd also like to interact in real-time and guide the process, so we can get the colour of grandad's suit and grandma's bouquet just right. So let's search for "real-time guided image colourisation" - but we'll use the American English spelling "colorization".

[[https://arxiv.org/search/?query=real-time+guided+image+colorization&searchtype=title&abstracts=show&order=-announced_date_first&size=50|Searching]] for "real-time guided image colorization" brings up our paper straight away, with a handy [[https://arxiv.org/pdf/1705.02999.pdf|PDF]] link((this is a contrived example of course, but the principle applies regardless)).

{{workshops:prototypes:2022-23delivery-lasercutcovers:machine_learning:arxiv_search.jpg?800|}}
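If we wanted to automate this kind of search, arXiv also offers a free public API that returns results as an Atom feed. Below is a minimal Python sketch of the same title search, using only the standard library - the search phrase and ''max_results'' value are simply the ones from our example above, and the query syntax follows the arXiv API's ''search_query'' format.

<code python>
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Build the same title search we ran on the arXiv website
params = urllib.parse.urlencode({
    "search_query": 'ti:"real-time guided image colorization"',
    "max_results": 5,
})
url = "http://export.arxiv.org/api/query?" + params

# The API returns an Atom XML feed with one <entry> per paper
with urllib.request.urlopen(url) as response:
    feed = ET.parse(response)

ns = {"atom": "http://www.w3.org/2005/Atom"}
for entry in feed.getroot().findall("atom:entry", ns):
    title = entry.find("atom:title", ns).text.strip()
    link = entry.find("atom:id", ns).text
    print(title, "->", link)
</code>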
===Examining the Abstract===

All research papers begin with an abstract, and a well written abstract will tell us all we need to know about whether the paper is relevant for us, particularly if we are looking for a working demonstration. This time we are in luck - there is a link to [[https://richzhang.github.io/ideepcolor/|an ideepcolor demo site]] at the end of the abstract.

===Check out the Demo===

The demo for ideepcolor looks great and we've got a link at the top of the page, where ideepcolor is [[https://github.com/junyanz/interactive-deep-colorization|implemented]] on github.

====Code Implementation on github.com====

[[https://github.com/|Github.com]] is a website used by software developers to create, collaborate on and share source code, and is most likely the largest repository of source code in the world. Github is named after [[https://git-scm.com/|git]], a free and open-source ([[https://en.wikipedia.org/wiki/Free_and_open-source_software|FOSS]]) distributed version-control system for tracking changes in source code during software development. Git means that developers from all over the world can work on the same code and, if the project is open source, build on, expand and re-purpose shared code((if made available under an [[https://en.wikipedia.org/wiki/Open-source_software|appropriate license]])).

But let's back up a bit and cover what source code is.

===Using the Source===

[[https://simple.wikipedia.org/wiki/Source_code|Source code]] is the instructions for a computer program contained in a simple text document. For a computer to run a program, the source code either has to be:

  * **compiled** into binary machine code by a [[https://simple.wikipedia.org/wiki/Compiler|compiler]]: this file is executable - in this case execute just means it can be read, understood and acted on by the computer, or
  * **interpreted** by another program, which directly executes the code.

==== ====

Here is an example of source code - in this case it's a simple program in the C programming language that shows "Hello, world!" on the screen.

<code c>
#include <stdio.h>

int main(void)
{
    printf("Hello, world!\n");
    return 0;
}
</code>

Despite the strange symbols, if you know how the C language is written, this program is **human readable**.

==== ====

Once this code is run through a compiler, we get a binary executable file - which is **machine readable**.

==== ====

But with the right tools (like a hex editor) we can still open the file and edit it.

==== ====

Here is the binary for our "Hello, world!" program.

{{workshops:prototypes:2022-23delivery-lasercutcovers:machine_learning:helloworld_binary.png?800|}}
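In fact we don't even need a dedicated hex editor to take a quick look - a few lines of Python (3.8 or later) will dump the raw bytes of any file. This is just a sketch: ''helloworld'' is a placeholder for wherever our compiled program ended up.

<code python>
# Peek at the start of a compiled program - 'helloworld' is a
# placeholder for the name of our compiled binary.
with open("helloworld", "rb") as f:
    data = f.read(64)  # just the first 64 bytes

# Print 16 bytes per row as hexadecimal, with the offset on the left
for offset in range(0, len(data), 16):
    row = data[offset:offset + 16]
    print(f"{offset:08x}  {row.hex(' ')}")
</code>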
===Open Source===

Dokuwiki (the software we are using for this wiki) is open source, developed publicly, and freely [[https://www.dokuwiki.org/|available]] on the internet. Anyone is able to grab the source code and run it, modify it or redistribute it.

==== ====

Below is an example of the **open source** code for this wiki, which is written in a language called [[https://simple.wikipedia.org/wiki/PHP|php]].

<code php>
// define all DokuWiki globals here (needed within test requests but also helps to keep track)
global $ACT, $INPUT, $QUERY, $ID, $REV, $DATE_AT, $IDX, $DATE, $RANGE, $HIGH, $TEXT, $PRE, $SUF, $SUM, $INFO, $JSINFO;

if(isset($_SERVER['HTTP_X_DOKUWIKI_DO'])) {
    $ACT = trim(strtolower($_SERVER['HTTP_X_DOKUWIKI_DO']));
} elseif(!empty($_REQUEST['idx'])) {
    $ACT = 'index';
} elseif(isset($_REQUEST['do'])) {
    $ACT = $_REQUEST['do'];
} else {
    $ACT = 'show';
}
</code>

==== ====

How did we get hold of the source code for this wiki? In this case all we did was look in the dokuwiki source found on [[https://github.com/splitbrain/dokuwiki|github]], pick a bit of code at random and throw it into our wiki.

So, finding the source for open software is easy, but to do the same thing with a closed source program is usually difficult or impossible: either you purchase or are given access to the code - any other method may break all manner of licenses and laws.

====ideepcolor on Github====

For a project like ideepcolor, Github is where researchers and developers describe how they achieved their results with real, working code. There is generally an introduction, which should contain any major updates to the project, then we go through the prerequisites, setting up or getting started, installation, training and (hopefully) application.

There are a number of ways to demonstrate the application of a model. ideepcolor has built a custom Graphical User Interface (GUI), which we demonstrated in our first ML workshop. More commonly, demos are done with a [[https://jupyter.org/|jupyter notebook]] or [[https://colab.research.google.com/notebooks/welcome.ipynb|Google Colab]] notebook.

Now, let's look at the updates at the top of the ideepcolor repository - which tell us:

> 10/3/2019 Update: Our technology is also now available in Adobe Photoshop Elements 2020. See this [[https://helpx.adobe.com/photoshop-elements/using/colorize-photo.html|blog]] and [[https://www.youtube.com/watch?v=tmXg4N4YlJg|video]] for more details.

So it looks like this project has moved to the next stage - integrating ideepcolor into a commercial product.

=====Commercialisation: Standalone or Software as a Service=====

Many ML projects stay as research only, but for those that make it into commercial production, there are generally two paths:

  * as a component of an existing software product, as illustrated by ideepcolor, or
  * offered as Software as a Service (SaaS), usually as a subscription.

We can expect a commercialised ML product to be faster, more stable and more refined than a demo on github, with a price tag to match.

=====Other ML Examples=====

Given the process outlined above, let's explore a few more ML ideas. Instead of starting on arxiv.org, let's jump straight to searching github, so we know the projects we find have a good chance of a functioning demo.

====Searching Github====

  * real-time voice clone
  * Text To Speech (TTS)
  * Audio Source Separation (spleeter) - see the sketch below
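As a taste of the last of these, here is roughly what Spleeter looks like in use. This is a sketch following the Python API shown in the project's documentation, and it assumes Spleeter has already been installed (e.g. with pip) - ''song.mp3'' and ''output/'' are placeholders for your own audio file and destination folder.

<code python>
from spleeter.separator import Separator

# Load the pre-trained 2-stem model: vocals + accompaniment
separator = Separator('spleeter:2stems')

# Separate our audio file; the stems are written under output/
separator.separate_to_file('song.mp3', 'output/')
</code>

We'll try this out properly in Google Colab in the second half of the workshop, where the model can run on Google's hardware rather than our own.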