# replicate

**Repository Path**: mirrors_github/replicate

## Basic Information

- **Project Name**: replicate
- **Description**: Dump and load relational objects between Ruby environments.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-08-08
- **Last Updated**: 2026-02-14

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

Dump and load relational objects between Ruby environments.
===========================================================

The project started at GitHub to simplify the process of getting real production
data into development and staging environments. We use it to replicate entire
repository data (including associated issue, pull request, commit comment, etc.
records) from production to our development environments with a single command.
It's excessively useful for troubleshooting issues, support requests, and
exception reports as well as for establishing real data for evaluating design
concepts.

Synopsis
--------

### Installing

    $ gem install replicate

### Dumping objects

Evaluate a Ruby expression, dumping all resulting objects to standard output:

    $ replicate -r ./config/environment -d "User.find(1)" > user.dump
    ==> dumped 4 total objects:
    Profile        1
    User           1
    UserEmail      2

The `-r ./config/environment` option is used to require environment setup and
model instantiation code needed by the ruby expression.

### Dumping many objects with a dump script

Dump scripts are normal ruby source files evaluated in the context of the
dumper. The `dump(object)` method is used to put objects into the dump stream.

```ruby
# config/replicate/dump-stuff.rb
require 'config/environment'

%w[rtomayko/tilt rtomayko/bcat].each do |repo_name|
  repo = Repository.find_by_name_with_owner(repo_name)
  dump repo
  dump repo.commit_comments
  dump repo.issues
end
```

Run the dump script:

    $ replicate -d config/replicate/dump-stuff.rb > repos.dump
    ==> dumped 1479 total objects:
    AR::Habtm                   101
    CommitComment                95
    Issue                       101
    IssueComment                427
    IssueEvent                  308
    Label                         5
    Language                     19
    LanguageName                  1
    Milestone                     3
    Organization                  4
    Profile                      82
    PullRequest                  44
    PullRequestReviewComment      8
    Repository                   20
    Team                          4
    TeamMember                    6
    User                         89
    UserEmail                   162

### Loading many objects:

    $ replicate -r ./config/environment -l < repos.dump
    ==> loaded 1479 total objects:
    AR::Habtm                   101
    CommitComment                95
    Issue                       101
    IssueComment                427
    IssueEvent                  308
    Label                         5
    Language                     19
    LanguageName                  1
    Milestone                     3
    Organization                  4
    Profile                      82
    PullRequest                  44
    PullRequestReviewComment      8
    Repository                   20
    Team                          4
    TeamMember                    6
    User                         89
    UserEmail                   162

### Dumping and loading over ssh

    $ remote_command="replicate -r /app/config/environment -d 'User.find(1234)'"
    $ ssh example.org "$remote_command" |replicate -r ./config/environment -l

ActiveRecord
------------

Basic support for dumping and loading ActiveRecord objects is included. The
tests pass under ActiveRecord versions 2.2.3, 2.3.5, 2.3.14, 3.0.10, 3.1.0, and 3.2.0 under
MRI 1.8.7 as well as under MRI 1.9.2.

To use customization macros in your models, require the replicate library after
ActiveRecord (in e.g., `config/initializers/libraries.rb`):

```ruby
require 'active_record'
require 'replicate'
```

ActiveRecord support works sensibly without customization so this isn't strictly
necessary to use the `replicate` command. The following sections document the
available customization macros.

### Association Dumping

The baked in support adds some more or less sensible default behavior for all
subclasses of `ActiveRecord::Base` such that dumping an object will bring in
objects related via `belongs_to` and `has_one` associations.

Unlike 1:1 associations, `has_many` and `has_and_belongs_to_many` associations
are not automatically included. Doing so would quickly lead to the entire
database being sucked in. It can be useful to mark specific associations for
automatic inclusion using the `replicate_associations` macro. For instance,
to always include `EmailAddress` records belonging to a `User`:

```ruby
class User < ActiveRecord::Base
  belongs_to :profile
  has_many   :email_addresses

  replicate_associations :email_addresses
end
```

You may also do this by passing an option in your dump script:

```ruby
dump User.all, :associations => [:email_addresses]
```

### Natural Keys

By default, the loader attempts to create a new record with a new primary key id
for all objects. This can lead to unique constraint errors when a record already
exists with matching attributes. To update existing records instead of
creating new ones, define a natural key for the model using the `replicate_natural_key`
macro:

```ruby
class User < ActiveRecord::Base
  belongs_to :profile
  has_many   :email_addresses

  replicate_natural_key :login
  replicate_associations :email_addresses
end

class EmailAddress < ActiveRecord::Base
  belongs_to :user
  replicate_natural_key :user_id, :email
end
```

Multiple attribute names may be specified to define a compound key. Foreign key
column attributes (`user_id`) are often included in natural keys.

### Omission of attributes and associations

You might want to exclude some attributes or associations from being dumped. For
this, use the replicate_omit_attributes macro:

```ruby
class User < ActiveRecord::Base
  has_one    :profile

  replicate_omit_attributes :created_at, :profile
end
```

You can omit belongs_to associations by omitting the foreign key column.

You may also do this by passing an option in your dump script:

```ruby
dump User.all, :omit => [:profile]
```

### Validations and Callbacks

__IMPORTANT:__ All ActiveRecord validations and callbacks are disabled on the
loading side. While replicate piggybacks on AR for relationship information and
uses `ActiveRecord::Base#save` to write objects to the database, it's designed
to act as a simple dump / load tool.

It's sometimes useful to run certain types of callbacks on replicate. For
instance, you might want to create files on disk or load information into a
separate data store any time an object enters the database. The best way to go
about this currently is to override the model's `load_replicant` class method:

```ruby
class User < ActiveRecord::Base
  def self.load_replicant(type, id, attrs)
    id, object = super
    object.register_in_redis
    object.some_other_callback
    [id, object]
  end
end
```

This interface will be improved in future versions.

Custom Objects
--------------

Other object types may be included in the dump stream so long as they implement
the `dump_replicant` and `load_replicant` methods.

### dump_replicant

The dump side calls `#dump_replicant(dumper, opts={})` on each object. The method must
call `dumper.write()` with the class name, id, and hash of primitively typed
attributes for the object:

```ruby
class User
  attr_reader   :id
  attr_accessor :name, :email

  def dump_replicant(dumper, opts={})
    attributes = { 'name' => name, 'email' => email }
    dumper.write self.class, id, attributes, self
  end
end
```

### load_replicant

The load side calls `::load_replicant(type, id, attributes)` on the class to
load each object into the current environment. The method must return an
`[id, object]` tuple:

```ruby
class User
  def self.load_replicant(type, id, attributes)
    user = User.new
    user.name  = attributes['name']
    user.email = attributes['email']
    user.save!
    [user.id, user]
  end
end
```

How it works
------------

The dump format is designed for streaming relational data. Each object is
encoded as a `[type, id, attributes]` tuple and marshalled directly onto the
stream. The `type` (class name string) and `id` must form a distinct key when
combined, `attributes` must consist of only string keys and simply typed values.

Relationships between objects in the stream are managed as follows:

 - An object's attributes may encode references to objects that precede it
   in the stream using a simple tuple format: `[:id, 'User', 1234]`.

 - The dump side ensures that objects are written to the dump stream in
   "reference order" such that when an object A includes a reference attribute
   to an object B, B is guaranteed to arrive before A.

 - The load side maintains a mapping of ids from the dumping system to the newly
   replicated objects on the loading system. When the loader encounters a
   reference value `[:id, 'User', 1234]` in an object's attributes, it converts it
   to the load side id value.

Dumping and loading happens in a streaming fashion. There is no limit on the
number of objects included in the stream.