git-fast-export(1)
==================

NAME
----
git-fast-export - Git data exporter

SYNOPSIS
--------
[verse]
'git fast-export' [<options>] | 'git fast-import'

DESCRIPTION
-----------
This program dumps the given revisions in a form suitable to be piped
into 'git fast-import'.

You can use it as a human-readable bundle replacement (see
linkgit:git-bundle[1]), or as a format that can be edited before being
fed to 'git fast-import' in order to do history rewrites (an ability
relied on by tools like 'git filter-repo').
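
To get a feel for the stream, the following minimal sketch builds a
throwaway repository and prints its export. The directory layout, the
`Demo` identity, and the branch name `main` are assumptions made purely
for illustration.

```shell
set -eu
# Throwaway identity so the commit succeeds without any user configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/src"
# Pin the branch name so the example does not depend on local defaults.
git -C "$work/src" symbolic-ref HEAD refs/heads/main

echo one >"$work/src/file"
git -C "$work/src" add file
git -C "$work/src" commit -q -m first

# Dump every ref as a plain-text command stream and show it.
git -C "$work/src" fast-export --all >"$work/stream"
cat "$work/stream"
```

The output is a sequence of `blob` and `commit` commands that refer to
each other through `mark` identifiers; see linkgit:git-fast-import[1]
for the stream format.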

OPTIONS
-------
--progress=<n>::
	Insert 'progress' statements every <n> objects, to be shown by
	'git fast-import' during import.

--signed-tags=(verbatim|warn|warn-strip|strip|abort)::
	Specify how to handle signed tags. Since any transformation
	after the export can change the tag names (which can also happen
	when excluding revisions), the signatures will not match.
+
With 'abort' (the default), this program dies when it encounters a
signed tag. With 'strip', the tags are silently made unsigned; with
'warn-strip', they are made unsigned and a warning is displayed. With
'verbatim', they are silently exported; with 'warn', they are exported
and a warning is displayed.

--tag-of-filtered-object=(abort|drop|rewrite)::
	Specify how to handle tags whose tagged object is filtered out.
	Since revisions and files to export can be limited by path,
	tagged objects may be filtered completely.
+
When asking to 'abort' (which is the default), this program will die
when encountering such a tag. With 'drop' it will omit such tags from
the output. With 'rewrite', if the tagged object is a commit, it will
rewrite the tag to tag an ancestor commit (via parent rewriting; see
linkgit:git-rev-list[1]).

-M::
-C::
	Perform move and/or copy detection, as described in the
	linkgit:git-diff[1] manual page, and use it to generate
	rename and copy commands in the output dump.
+
Note that earlier versions of this command did not complain and
produced incorrect results if you gave these options.

--export-marks=<file>::
	Dumps the internal marks table to <file> when complete.
	Marks are written one per line as `:markid SHA-1`. Only marks
	for revisions are dumped; marks for blobs are ignored.
	Backends can use this file to validate imports after they
	have been completed, or to save the marks table across
	incremental runs. As <file> is only opened and truncated
	at completion, the same path can also be safely given to
	--import-marks.
	The file will not be written if no new object has been
	marked/exported.

--import-marks=<file>::
	Before processing any input, load the marks specified in
	<file>. The input file must exist, must be readable, and
	must use the same format as produced by --export-marks.

--mark-tags::
	In addition to labelling blobs and commits with mark ids, also
	label tags. This is useful in conjunction with
	`--export-marks` and `--import-marks`, and is also useful (and
	necessary) for exporting of nested tags. It does not hurt
	other cases and would be the default, but many fast-import
	frontends are not prepared to accept tags with mark
	identifiers.
+
Any commits (or tags) that have already been marked will not be
exported again. If the backend uses a similar --import-marks file,
this allows for incremental bidirectional exporting of the repository
by keeping the marks the same across runs.

--fake-missing-tagger::
	Some old repositories have tags without a tagger. Because the
	fast-import protocol was strict about this and did not allow
	such tags, this option fakes a tagger so that the output can
	be fast-imported.

--use-done-feature::
	Start the stream with a 'feature done' stanza, and terminate
	it with a 'done' command.

--no-data::
	Skip output of blob objects and instead refer to blobs via
	their original SHA-1 hash. This is useful when rewriting the
	directory structure or history of a repository without
	touching the contents of individual files. Note that the
	resulting stream can only be used by a repository which
	already contains the necessary objects.

--full-tree::
	This option will cause fast-export to issue a "deleteall"
	directive for each commit followed by a full list of all files
	in the commit (as opposed to just listing the files which are
	different from the commit's first parent).

--anonymize::
	Anonymize the contents of the repository while still retaining
	the shape of the history and stored tree. See the section on
	`ANONYMIZING` below.

--anonymize-map=<from>[:<to>]::
	Convert token `<from>` to `<to>` in the anonymized output. If
	`<to>` is omitted, map `<from>` to itself (i.e., do not
	anonymize it). See the section on `ANONYMIZING` below.

--reference-excluded-parents::
	By default, running a command such as `git fast-export
	master~5..master` will not include the commit master{tilde}5
	and will make master{tilde}4 no longer have master{tilde}5 as
	a parent (though both the old master{tilde}4 and new
	master{tilde}4 will have all the same files). Use
	--reference-excluded-parents to instead have the stream
	refer to commits in the excluded range of history by their
	sha1sum. Note that the resulting stream can only be used by a
	repository which already contains the necessary parent
	commits.

--show-original-ids::
	Add an extra directive to the output for commits and blobs,
	`original-oid <SHA1SUM>`. While such directives will likely be
	ignored by importers such as git-fast-import, they may be useful
	for intermediary filters (e.g. for rewriting commit messages
	which refer to older commits, or for stripping blobs by id).

--reencode=(yes|no|abort)::
	Specify how to handle the `encoding` header in commit objects.
	When asking to 'abort' (which is the default), this program will
	die when encountering such a commit object. With 'yes', the
	commit message will be re-encoded into UTF-8. With 'no', the
	original encoding will be preserved.

--refspec::
	Apply the specified refspec to each ref exported. Multiple
	refspecs can be specified.

[<git-rev-list-args>...]::
	A list of arguments, acceptable to 'git rev-parse' and
	'git rev-list', that specifies the specific objects and references
	to export. For example, `master~10..master` causes the
	current master reference to be exported along with all objects
	added since its 10th ancestor commit and (unless the
	--reference-excluded-parents option is specified) all files
	common to master{tilde}9 and master{tilde}10.
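
The incremental workflow that `--export-marks` and `--import-marks`
describe can be sketched as follows. This is only an illustration: the
temporary directory, the `Demo` identity, the branch name `main`, and
the file names `marks`, `run1`, and `run2` are all invented for the
example.

```shell
set -eu
# Throwaway identity so the commits succeed without any user configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main

echo one >"$work/src/file"
git -C "$work/src" add file
git -C "$work/src" commit -q -m first

# First run: export everything and save the marks table.
git -C "$work/src" fast-export --export-marks="$work/marks" main >"$work/run1"

echo two >"$work/src/file"
git -C "$work/src" commit -q -a -m second

# Second run: reload the marks so that already-exported commits are
# skipped, then write the updated table back to the same file.
git -C "$work/src" fast-export \
	--import-marks="$work/marks" --export-marks="$work/marks" \
	main >"$work/run2"

# Count the commit commands in each stream.
grep -c '^commit ' "$work/run1"
grep -c '^commit ' "$work/run2"
```

The second stream should contain only the new commit, with a `from`
line that refers to the mark recorded for its parent in the first run.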

EXAMPLES
--------

-------------------------------------------------------------------
$ git fast-export --all | (cd /empty/repository && git fast-import)
-------------------------------------------------------------------

This will export the whole repository and import it into the existing
empty repository. Except for reencoding commits that are not in
UTF-8, it would be a one-to-one mirror.
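
One way to check the mirror on a small scale is to compare the branch
tips of the two repositories after the round trip. The sketch below
assumes a simple history with UTF-8 commit messages and no signed
objects; the repository names `src` and `dst`, the `Demo` identity, and
the branch name `main` are made up for the example.

```shell
set -eu
# Throwaway identity so the commits succeed without any user configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main

echo one >"$work/src/file"
git -C "$work/src" add file
git -C "$work/src" commit -q -m first
echo two >"$work/src/file"
git -C "$work/src" commit -q -a -m second

# Export the whole source repository into a freshly created empty one.
git init -q "$work/dst"
git -C "$work/src" fast-export --all | git -C "$work/dst" fast-import --quiet

# Matching object names mean the imported history is identical.
src_tip=$(git -C "$work/src" rev-parse refs/heads/main)
dst_tip=$(git -C "$work/dst" rev-parse refs/heads/main)
echo "source: $src_tip"
echo "copy:   $dst_tip"
```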

-----------------------------------------------------
$ git fast-export master~5..master |
	sed "s|refs/heads/master|refs/heads/other|" |
	git fast-import
-----------------------------------------------------

This makes a new branch called 'other' from 'master~5..master'
(i.e. if 'master' has linear history, it will take the last 5 commits).

Note that this assumes that none of the blobs and commit messages
referenced by that revision range contains the string
'refs/heads/master'.
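
Where that assumption is doubtful, the `--refspec` option listed under
OPTIONS can rename the ref at export time instead, leaving blob
contents and commit messages alone. The following is a sketch under
invented names (a branch `main`, a file `notes`, and a `Demo`
identity), not a drop-in replacement for the command above.

```shell
set -eu
# Throwaway identity so the commits succeed without any user configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/repo"
git -C "$work/repo" symbolic-ref HEAD refs/heads/main

# The file content mentions the ref name, which a blanket sed would rewrite.
for n in 1 2 3
do
	echo "revision $n of notes about refs/heads/main" >"$work/repo/notes"
	git -C "$work/repo" add notes
	git -C "$work/repo" commit -q -m "commit $n"
done

# Export the last two commits and rename the ref while exporting.
git -C "$work/repo" fast-export \
	--refspec=refs/heads/main:refs/heads/other main~2..main |
	git -C "$work/repo" fast-import --quiet

# The new branch holds the two exported commits with the file unchanged.
git -C "$work/repo" rev-list --count refs/heads/other
git -C "$work/repo" show refs/heads/other:notes
```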

ANONYMIZING
-----------

If the `--anonymize` option is given, git will attempt to remove all
identifying information from the repository while still retaining enough
of the original tree and history patterns to reproduce some bugs. The
goal is that a git bug which is found on a private repository will
persist in the anonymized repository, and the latter can be shared with
git developers to help solve the bug.

With this option, git will replace all refnames, paths, blob contents,
commit and tag messages, names, and email addresses in the output with
anonymized data. Two instances of the same string will be replaced
equivalently (e.g., two commits with the same author will have the same
anonymized author in the output, but bear no resemblance to the original
author string). The relationship between commits, branches, and tags is
retained, as well as the commit timestamps (but the commit messages and
refnames bear no resemblance to the originals). The relative makeup of
the tree is retained (e.g., if you have a root tree with 10 files and 3
trees, so will the output), but their names and the contents of the
files will be replaced.

If you think you have found a git bug, you can start by exporting an
anonymized stream of the whole repository:

---------------------------------------------------
$ git fast-export --anonymize --all >anon-stream
---------------------------------------------------

Then confirm that the bug persists in a repository created from that
stream (many bugs will not, as they really do depend on the exact
repository contents):

---------------------------------------------------
$ git init anon-repo
$ cd anon-repo
$ git fast-import <../anon-stream
$ ... test your bug ...
---------------------------------------------------

If the anonymized repository shows the bug, it may be worth sharing
`anon-stream` along with a regular bug report. Note that the anonymized
stream compresses very well, so gzipping it is encouraged. If you want
to examine the stream to see that it does not contain any private data,
you can peruse it directly before sending. You may also want to try:

---------------------------------------------------
$ perl -pe 's/\d+/X/g' <anon-stream | sort -u | less
---------------------------------------------------

which shows all of the unique lines (with numbers converted to "X", to
collapse "User 0", "User 1", etc. into "User X"). This produces a much
smaller output, and it is usually easy to quickly confirm that there is
no private data in the stream.
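
As a complement to reading the collapsed output, you could also search
the stream for strings you already know to be private. The sketch
below is one possible way to do that; the identity, file name, branch
name, and the `known-private` list are all invented, and a clean result
covers only the strings you thought to list.

```shell
set -eu
# Invented private identity for the demonstration repository.
export GIT_AUTHOR_NAME='Alice Example' GIT_AUTHOR_EMAIL=alice@corp.example
export GIT_COMMITTER_NAME='Alice Example' GIT_COMMITTER_EMAIL=alice@corp.example

work=$(mktemp -d)
git init -q "$work/private"
git -C "$work/private" symbolic-ref HEAD refs/heads/private-branch

echo 'top secret contents' >"$work/private/secret-file.txt"
git -C "$work/private" add secret-file.txt
git -C "$work/private" commit -q -m 'Add the confidential plan'

git -C "$work/private" fast-export --anonymize --all >"$work/anon-stream"

# One known-private string per line: name, email domain, path, file
# content, commit message, and branch name.
cat >"$work/known-private" <<\EOF
Alice
corp.example
secret-file
top secret
confidential
private-branch
EOF

# grep exits non-zero when nothing matches, which is the desired outcome.
if grep -n -F -f "$work/known-private" "$work/anon-stream"
then
	echo "private strings found; do not share the stream" >&2
	leaked=yes
else
	echo "no known private strings found"
	leaked=no
fi
```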

Reproducing some bugs may require referencing particular commits or
paths, which becomes challenging after refnames and paths have been
anonymized. You can ask for a particular token to be left as-is or
mapped to a new value. For example, if you have a bug which reproduces
with `git rev-list sensitive -- secret.c`, you can run:

---------------------------------------------------
$ git fast-export --anonymize --all \
	--anonymize-map=sensitive:foo \
	--anonymize-map=secret.c:bar.c \
	>stream
---------------------------------------------------

After importing the stream, you can then run `git rev-list foo -- bar.c`
in the anonymized repository.

Note that paths and refnames are split into tokens at slash boundaries.
The command above would anonymize `subdir/secret.c` as something like
`path123/bar.c`; you could then search for `bar.c` in the anonymized
repository to determine the final pathname.

To make referencing the final pathname simpler, you can map each path
component; so if you also anonymize `subdir` to `publicdir`, then the
final pathname would be `publicdir/bar.c`.
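
Putting the mappings together, a complete run might look like the
sketch below. The repository names `private` and `anon-repo`, the
`Demo` identity, and the file contents are assumptions for the sake of
the example; only the `--anonymize-map` tokens come from the text
above.

```shell
set -eu
# Throwaway identity so the commit succeeds without any user configuration.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/private"
git -C "$work/private" symbolic-ref HEAD refs/heads/sensitive

mkdir "$work/private/subdir"
echo 'int main(void) { return 0; }' >"$work/private/subdir/secret.c"
git -C "$work/private" add subdir/secret.c
git -C "$work/private" commit -q -m 'Add secret.c'

# Map the ref name and both path components to fixed public tokens.
git -C "$work/private" fast-export --anonymize --all \
	--anonymize-map=sensitive:foo \
	--anonymize-map=subdir:publicdir \
	--anonymize-map=secret.c:bar.c \
	>"$work/stream"

git init -q "$work/anon-repo"
git -C "$work/anon-repo" fast-import --quiet <"$work/stream"

# The original query now works with the mapped names.
git -C "$work/anon-repo" rev-list foo -- publicdir/bar.c
```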

LIMITATIONS
-----------

Since 'git fast-import' cannot tag trees, you will not be
able to export the linux.git repository completely, as it contains
a tag referencing a tree instead of a commit.

SEE ALSO
--------
linkgit:git-fast-import[1]

GIT
---
Part of the linkgit:git[1] suite