Machine Learning and Data Analysis Coordination

The top priority of this project is free COVID-19 testing; see the homepage for general details about the project. We are seeking machine learning teams from around the world to actively collaborate with us. Together, we can make this happen.

Multiple Teams Concept

Based on the collection list, it becomes clear that different machine learning and data analysis teams can tackle different algorithms for different insights. Collectively we call these teams “ML teams” although it doesn’t have to be machine learning; in many cases, even procedural experts can assist. For the initial group we anticipate at least 10 ML teams globally, visionaries and risk takers who are willing to try to help the world against the odds. By the time progress begins to become apparent and we approach the tail end of the project (which under normal circumstances would take a very long time), we hope to see up to 100 ML teams. Each success along the way will reduce the risk of failure. As the project becomes clearly feasible but still requires further refinements, we anticipate an explosion of interest as more teams join to race against the clock. For this reason, the initial 10 ML teams should focus more on demonstrating feasibility than on perfecting reliability.

To pave the way for ML teams to do what they do best, Pleasant Solutions will entirely manage the application, data collection, core hosting, and machine learning coordination platforms. As time progresses, we will aim to iterate to make things increasingly organized, but we won’t wait for a perfect situation to get started. Every day counts.

Pleasant Solutions will participate in the machine learning and data analysis, but only as one of many teams. We do not have the capacity to tackle this alone; frankly, no team does.

Benefits of Participating

The top priority of this project is free COVID-19 testing. We acknowledge that ML Teams may have other objectives in participating; here are the benefits to consider:

  • Work to help save many lives, without having to go it alone.
  • Get listed on the contributors page. This will be especially useful if we succeed.
  • Build good will with your other contacts who are participating pro bono.
  • Learn a lot by tackling these challenges, as well as by watching other teams do their part.
  • Build capacity to launch future products based on the specific insights that your team discovers, including patenting your portion (see legals below).
  • Prize money

Data Derivatives

Based on the core data collected, every team will be able to create a “derivative”, which is either a view of the data (e.g., the audio sample but with normalized volume) or metadata such as “original dB volume of sample”.
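
To make the two kinds of derivative concrete, here is a minimal Python sketch. It is an illustration only, not the project's implementation: the function names, the choice of peak normalization, and the reading of "volume" as RMS level in dB relative to full scale are all assumptions, and samples are assumed to be floats between -1.0 and 1.0.

```python
import math


def rms_dbfs(samples):
    """Return the RMS level of the samples in dB relative to full scale.

    Hypothetical helper: one possible reading of a "volume" metadata value.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Pure silence has no finite level on a logarithmic scale.
    return float("-inf") if rms == 0.0 else 20.0 * math.log10(rms)


def normalize_peak(samples, target_peak=1.0):
    """Return a copy of the samples scaled so the largest magnitude equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


sample = [0.0, 0.25, -0.5, 0.25]

# A "view" derivative: the same audio with normalized volume.
normalized_view = normalize_peak(sample)

# A "metadata" derivative: a single value describing the original sample.
metadata = {"original_rms_dbfs": rms_dbfs(sample)}
```

The view keeps the shape of the original data, while the metadata derivative reduces it to one number that other teams could filter or sort on.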

Derivatives can chain to depend on each other, within the same ML team or across different ML teams.
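
When derivatives depend on each other, they have to be computed in dependency order. The sketch below shows one way to work out that order using Python's standard graphlib module (Python 3.9 or later). The derivative names and the "team/name" naming scheme are hypothetical, chosen only for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies):
    """Return derivative names ordered so each one comes after everything it depends on.

    `dependencies` maps a derivative name to the set of names it depends on.
    Raises graphlib.CycleError if the derivatives depend on each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical chain spanning two teams.
dependencies = {
    "team_a/normalized_audio": set(),
    "team_a/cough_segments": {"team_a/normalized_audio"},
    "team_b/cough_features": {"team_a/cough_segments"},
}

order = build_order(dependencies)
```

A circular chain (A depends on B, B depends on A) cannot be computed at all, so the sorter reports it as an error rather than returning an order.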

Infrastructure

Our coordination infrastructure is designed to allow each team to stick to whatever they already know best. There are no new tools or software to learn. As time allows, we may be able to provide all teams with tools or access to services, but initially we ask that teams “come as they are” so we can get started immediately.

  • The data (anticipated to be many terabytes) will be stored in a central cloud repository, and all derivatives will be centralized as well.
  • Each ML Team will be given one or more virtual machines (powerful ones, we hope) with fast bandwidth and unlimited access to the data set via a very simple API for fetching records and writing entries back to the team’s derivatives.
    • Teams can install the tools they are familiar with (from MATLAB to code generators to pretty much anything).
    • The virtual machines are not intended as “development machines”; teams will build, design, compile, and explore outside of the VM, then use the VM to apply a draft against the data.
  • Each ML Team will be given a smaller sample of audio recordings (perhaps 500) to download to their own computers for local preliminary testing and iteration. Downloading the entire dataset outside of the VM will not be permitted.
  • There will be a central catalogue of data and derivatives across all ML Teams.
    • Each ML Team can create a derivative, which has:
      • A name
      • Status (such as draft, usable but still iterating, reliable, or deprecated/abandoned)
      • Description
      • Contact information
    • Each ML Team will only be able to change the data of the derivative columns they created; however, they can leverage other teams’ columns.
  • We will have multiple chat rooms for collaboration topics:
    • Technical support
    • Idea brainstorming
    • Insight announcements
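
As a rough illustration of the catalogue described above, the following Python sketch models a derivative entry and the rule that only the creating team may write to its own derivative, while any team may read it. This is an in-memory stand-in under stated assumptions, not the actual API: the class names, the method names, and the owner_team field are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    ITERATING = "usable but still iterating"
    RELIABLE = "reliable"
    DEPRECATED = "deprecated/abandoned"


@dataclass(frozen=True)
class Derivative:
    name: str
    owner_team: str  # assumption: the catalogue records which team created the entry
    status: Status
    description: str
    contact: str


class Catalogue:
    """In-memory stand-in for the central catalogue (illustration only)."""

    def __init__(self):
        self._entries = {}  # derivative name -> Derivative
        self._values = {}   # (record_id, derivative name) -> stored value

    def register(self, derivative):
        if derivative.name in self._entries:
            raise ValueError(f"derivative {derivative.name!r} already exists")
        self._entries[derivative.name] = derivative

    def put(self, team, record_id, name, value):
        # Only the team that created a derivative may change its data.
        if self._entries[name].owner_team != team:
            raise PermissionError(f"{team} does not own {name!r}")
        self._values[(record_id, name)] = value

    def get(self, record_id, name):
        # Any team may read (leverage) any other team's derivative.
        return self._values[(record_id, name)]


catalogue = Catalogue()
catalogue.register(Derivative(
    name="normalized_audio",
    owner_team="team_a",
    status=Status.DRAFT,
    description="Audio sample with normalized volume",
    contact="team-a@example.org",
))
catalogue.put("team_a", 1, "normalized_audio", [0.0, 0.5, -1.0, 0.5])
```

In this sketch, a call such as catalogue.put("team_b", 1, "normalized_audio", ...) raises PermissionError, while catalogue.get(1, "normalized_audio") works for every team.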

Legals

We are still exploring this, but here is our draft approach:

  • ML Teams will retain ownership of the process and method used to determine a given derivative, but they must provide a perpetual, global, free license to Pleasant and any Pleasant clients for unlimited use towards COVID-19, Coronavirus, or any future pandemic (as declared by the WHO), and provide Pleasant with full code sufficient to execute it on demand. Teams may otherwise patent their approach and profit from it in the future, so long as the scope of the patent does not leverage or include insights developed by other ML Teams.
  • ML Teams will retain ownership of the chain of derivatives they specifically invent, but they must provide a perpetual, global, free license to Pleasant and any Pleasant clients for unlimited use towards COVID-19, Coronavirus, or any future pandemic (as declared by the WHO). Teams may otherwise patent their chain and profit from it in the future, so long as the scope of the patent does not leverage or include insights developed by other ML Teams.
  • Pleasant retains exclusive ownership of the platform and innovation surrounding listed derivatives, but is providing a free license for COVID-19 purposes.
  • Pleasant and/or volunteers own the collected data, but are providing a license for COVID-19 purposes. A portion of the data will be available for offline use (outside the central system); the bulk will be restricted to use within the VMs.
  • Joining as an ML Team will require:
    • Some indication of reputation or ongoing corporate presence.
    • Commitment of minimum participation (so we aren’t wasting our time onboarding).
    • E-signed agreement of the legal terms.
    • Potentially an NDA (we aren’t sure at this point whether it would be a benefit or a hindrance).

First Steps

Contact us at support@caredemic.com

To save time, please include the following up front:

  1. Your name, company name, country, languages you speak, email, and phone.
  2. The number of employees within your company / team (even those that can’t participate).
  3. How many people you would be willing to contribute, and how many hours each, for the first month and the second month.
  4. When you would be able to start.
  5. Areas of expertise (be specific: not just “data analysis”, “procedural”, or “ML”, but what kind of ML is your specialty).
  6. Whether or not you have any concerns about the legals.
  7. Any special requirements.

References

Not sure how feasible this is? Read the homepage for our perspective, as well as the References Page.