[edk2] [edk2-announce] Research Request

Laszlo Ersek lersek at redhat.com
Thu Dec 6 05:33:44 PST 2018


On 12/05/18 20:09, Jeremiah Cox wrote:
> Hi Laszlo,
> Regarding "comprehensive backup/archival functionality that is core to the service itself", are you speaking more to GitHub's internal metadata verbosity (e.g. not losing PR details when branches and repos are deleted), GitHub's backup strategy to prevent data loss, or the ability to export all of this data from GitHub?

The last one.

Unless the service sends sufficiently comprehensive emails, so that a
human reader -- note: not writer -- can later get a full understanding
of the events related to the project, the service should provide some
other (core) functionality to keep an external archive up-to-date at all
times. The goal of both alternatives is the same -- at any point, if the
service suddenly becomes unusable, the project should be at liberty to
take its history along elsewhere. At that point, extracting data may
no longer be possible, which is why the archive should continuously be
refreshed, on-line.

> I believe your PR experiments are exploring the first point about metadata verbosity.

Not exactly / not only. I care about multiple topics. One is the
usability of the WebUI itself (e.g. what artifacts one can attach review
comments to). Another is the longevity of artifacts as they are
presented on the WebUI (and to local git command lines). Yet another is
how independent a project can remain / how easily it can take its
history along elsewhere. Others have mentioned offline reviewing of
project events (recent or not so recent).

> We've done some experimentation of our own and have found the verbosity acceptable for us.
> 
> GitHub's internal backup strategy is published:
> https://help.github.com/articles/github-security/#file-system-and-backups 
> 
> Regarding export, I discovered GitHub has a preview REST API dedicated to backup & archival.  GitHub will package up all of our metadata into a big tarball:
> https://developer.github.com/v3/migrations/orgs/ 
> At a glance it appears to be simple to use and comprehensive.

If this archive is complete (that is, if we download it on Monday, fail
to download it on Tuesday, manage to download it again on Wednesday, and
the Wed download contains all the Tues events as well), then I agree it
is comprehensive enough, because outages in the consumer component will
not cause permanent data loss -- eventually the next successful download
will fill the gap.
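
To make the property concrete, here is a minimal sketch of the Monday /
Tuesday / Wednesday scenario. It does not call GitHub at all; the event
names and the helpers (merge_snapshot, replay, lost_events) are
hypothetical, and each download is modeled as a plain set of event IDs,
with None standing for a failed download. It contrasts a cumulative
export with a hypothetical incremental one that only carries the events
of the current day.

```python
from typing import List, Optional, Set

Snapshot = Optional[Set[str]]  # None models a failed download


def merge_snapshot(archive: Set[str], snapshot: Snapshot) -> Set[str]:
    """Return the local archive after one daily download attempt."""
    if snapshot is None:
        # Consumer-side outage: the archive is left unchanged.
        return set(archive)
    return archive | snapshot


def replay(downloads: List[Snapshot]) -> Set[str]:
    """Apply a sequence of daily download attempts to an empty archive."""
    archive: Set[str] = set()
    for snapshot in downloads:
        archive = merge_snapshot(archive, snapshot)
    return archive


def lost_events(all_events: Set[str], archive: Set[str]) -> Set[str]:
    """Events that happened on the service but never reached the archive."""
    return all_events - archive


# Hypothetical project events, one per day.
monday = {"pr-1-opened"}
tuesday = {"pr-1-review-comment"}
wednesday = {"pr-1-merged"}
everything = monday | tuesday | wednesday

# Cumulative export: every successful download contains the full history.
cumulative = [monday, None, monday | tuesday | wednesday]
# Incremental export (assumed for contrast): only that day's events.
incremental = [monday, None, wednesday]

print(sorted(lost_events(everything, replay(cumulative))))
print(sorted(lost_events(everything, replay(incremental))))
```

With the cumulative export, the failed Tuesday download loses nothing;
with the incremental one, the Tuesday review comment is gone for good.
Only the former is "comprehensive enough" in the sense above.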

I'm unsure about the scope of this feature, however. The page you linked
starts with:

    The organization migrations API lets you move a repository from
    GitHub to GitHub Enterprise.

That's not really what I have in mind; instead, if the above
(comprehensive) download is indeed offered, we should download it daily.
That would sort-of cover alternative #2.

Thanks
Laszlo

