January 2015

[Repost] Hooking Up Web Apps With Docker

At Substantial, between the tides of client work, we interleave work on internal projects to explore product ideas and experiment with technology.

Recent experimentation by the San Francisco team for Substantial Dash included stepping outside our comfort zone for application deployment. We normally use Chef to provision servers and Capistrano to deploy application code to those servers.

Typical challenges

  • How do we build a host server and then install the web application?
  • How do we release a new version?
  • Is a version just application code, or does it include dependencies for the host server too?
  • Are server provisioning and app deployment reproducible?

Docker containers continue to emerge as an answer to these questions. In fact, Docker supports many of the best practices defined in The Twelve-Factor App.

Put your app in a box

Containerizing an application's runtime environment is a well-known software design pattern for modular, secure systems. The pattern occurs over multiple layers:

  • packaging the runtime environment (a "version")
  • containing the file system
  • firewalling the network
  • metering utilization of network and processor

Well-known container technologies range from the vendor-specific to the vendor-agnostic.

Lightweight virtualization with Docker

Docker containers share the kernel of the host, while virtual machines each have their own kernels sharing the processor. VMs are comparatively slow and expensive because they are so low-level. They duplicate the operating system overhead for every running virtual machine, preventing a single process scheduler from managing everything efficiently.

Today, the Docker daemon easily runs on Linux to host containers. The container images are typically based on a full Linux distribution so that all of the standard tools and libraries are available. Tiny containers may be created using minimal, embeddable OSs such as BusyBox.
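
For example, the stock busybox image from the public registry runs a single command inside a tiny container:

docker run busybox /bin/echo "hello from a tiny container"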

Images vs Containers

Every Docker "container" is run from an "image."

The image is a snapshot that can be moved around and reused; docker images lists the images available to run with the Docker daemon.

The container is the live runtime; docker ps lists the containers that are currently running.
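
The distinction shows up directly in the CLI:

# list the images available on this host
docker images

# list the currently running containers
docker ps

# include containers that have exited
docker ps -a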

The Dockerfile

The most visible part of Docker is the Dockerfile. Create it alongside your application code to construct an image of the application's complete environment. Notice in the example sketch below how everything the app needs to operate is built up from stock Ubuntu Linux.
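
The original post links to a full example Dockerfile; as a stand-in, here is a minimal sketch of that shape, assuming a hypothetical Ruby web app (package names and paths are illustrative):

# start from stock Ubuntu Linux
FROM ubuntu

# system-level dependencies the app needs to operate
RUN apt-get update
RUN apt-get install -y ruby ruby-dev build-essential

# application code, added late so the layers above stay cached
ADD . /app
WORKDIR /app

# application-level dependencies
RUN gem install bundler && bundle install

EXPOSE 8080
CMD ["bundle", "exec", "rackup", "--port", "8080"]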

Packaging the code

Many web developers are accustomed to deploying to a long-running server using tools like Capistrano. You push new code, restart the web server, and ta-da, a new version is deployed. But what happens if you need to change a system library, such as downgrading the programming language version (e.g. Ruby 2.0.0) or upgrading a dependency (e.g. LibXML or ImageMagick)? Somehow you would have to coordinate infrastructure changes with application changes; as applications grow, this can become very messy.

Containers solve this conundrum of server operations by defining all dependencies alongside the application code. Thanks to build caching, changing application code does not necessarily require rebuilding everything in the container. Notice in the Dockerfile sketch above that the application code is added late in the build sequence.

Runtime configuration

Configuration that remains constant across environments (dev, test, and production) can be kept with the application code. We include the standard etc/ config files right in the repo, and those files are added to the container with Dockerfile ADD statements.
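
For instance, a couple of hypothetical ADD statements (the paths are illustrative, not from the actual project):

ADD etc/nginx.conf /etc/nginx/nginx.conf
ADD etc/unicorn.rb /app/etc/unicorn.rb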

Configuration that is unique to each environment, or that is secret, is passed as environment variables when the docker run command starts the container. An example run command passing environment variables:

docker run -e BASE_URL="http://awesomeapp.com" -e SECRET_KEY=meowmeow image_tag_or_hash

Configuration of what command to run inside the container is set using the Dockerfile's CMD and/or ENTRYPOINT. If a container needs to run more than one process, then start a process manager such as supervisor to run and manage all the processes, as sketched below.
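
A minimal sketch of the supervisor pattern, assuming the config file ships in the repo's etc/ directory:

# install the supervisor process manager
RUN apt-get install -y supervisor

# describe every process the container should run
ADD etc/supervisord.conf /etc/supervisor/conf.d/app.conf

# run supervisor in the foreground as the container's main process
CMD ["/usr/bin/supervisord", "-n"]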

Hosting containers

Many container hosting options exist today. The official Docker docs list compatible OSs and vendors. Docker-specific workflows are supported by its creator dotCloud, by StackDock, and by more emerging companies.

Depending upon whom you select to host your containers, you get varying amounts of automation around Docker image and runtime management.

Our project's deployment testbed is the Digital Ocean Docker application. In this case, all we get is a bare-bones Ubuntu Linux host running the Docker daemon. So how can we upload our images to the host?

For the Substantial Dash project, we opted to use images as files. This is the simplest private approach for experimenting with Docker.
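
The original does not show the exact commands, but the image-as-file workflow generally looks like this (image and host names illustrative):

# export the image to a gzipped tarball
docker save awesomeapp | gzip > awesomeapp.tar.gz

# copy it to the Docker host
scp awesomeapp.tar.gz root@docker-host:

# import it on the host, ready for docker run
ssh root@docker-host "gunzip -c awesomeapp.tar.gz | docker load"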

Distributing server applications

Installation of most open-source server applications requires a mixture of technical knowledge and pain. Docker is poised to become the de facto standard for self-contained application packages that will run practically anywhere.

An example of packaging modern server software for simplified distribution is Discourse, an open-source discussion forum (used by BoingBoing BBS, the Ember.js forums, and Mozilla Community).

Discourse's Docker project coordinates both a single-container, all-in-one, simple deployment and a multi-container, service-oriented, high-availability deployment.

The Future

The prospects for web application deployment continue to evolve. Docker currently lies between the maturity of its core functionality (stable, secure containers) and the emergence of high-level utility (like "drag-and-drop" install and scaling of web services). This experiment is just the beginning.

[Repost] Dockerfile Best Practices

Dockerfiles provide a simple syntax for building images. The following are a few tips and tricks to help you get the most out of Dockerfiles.

1: Use the cache

Each instruction in a Dockerfile commits the change into a new image, which will then be used as the base of the next instruction. If an image exists with the same parent and instruction (except for ADD), docker will reuse that image instead of executing the instruction; this is the cache.

In order to effectively utilize the cache you need to keep your Dockerfiles consistent and only add the alterations at the end. All my Dockerfiles start with the same 5 lines.

FROM ubuntu
MAINTAINER Michael Crosby <michael@crosbymichael.com>

RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y

Changing the MAINTAINER instruction will force docker to re-execute the subsequent RUN instructions that update apt instead of hitting the cache.
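
Rebuilding an unchanged Dockerfile makes the cache hits visible in docker build's output (abridged):

Step 2 : RUN apt-get update
 ---> Using cache
Step 3 : RUN apt-get upgrade -y
 ---> Using cache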

1. Keep common instructions at the top of the Dockerfile to utilize the cache.

2: Use tags

Unless you are experimenting with docker, you should always pass the -t option to docker build so that the resulting image is tagged. A simple, human-readable tag will help you manage what each image was created for.

docker build -t="crosbymichael/sentry" .

2. Always pass -t to tag the resulting image.

3: EXPOSE-ing ports

Two of the core concepts of docker are repeatability and portability. Images should be able to run on any host and as many times as needed. With Dockerfiles you have the ability to map the private and public ports; however, you should never map the public port in a Dockerfile. By mapping to the public port on your host, you will only be able to have one instance of your dockerized app running.

# private and public mapping
EXPOSE 80:8080

# private only
EXPOSE 80

If the consumer of the image cares what public port the container maps to, they will pass the -p option when running the image; otherwise, docker will automatically assign a port for the container.
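
For example, two instances of the same image can share one host by choosing public ports at run time (ports illustrative):

# first instance: host port 8080 -> container port 80
docker run -d -p 8080:80 crosbymichael/sentry

# second instance on another public port
docker run -d -p 8081:80 crosbymichael/sentry

# ask docker which public port a container's port 80 maps to
docker port <container_id> 80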

3. Never map the public port in a Dockerfile.

4: CMD and ENTRYPOINT syntax

Both CMD and ENTRYPOINT are straightforward, but they have a hidden, err, "feature" that can cause issues if you are not aware of it. Two different syntaxes are supported for these instructions.

CMD /bin/echo
# or
CMD ["/bin/echo"]

This may not look like it would be an issue, but the devil in the details will trip you up. If you use the second syntax, where the CMD (or ENTRYPOINT) is an array, it acts exactly like you would expect. If you use the first syntax without the array, docker prepends /bin/sh -c to your command. This has always been in docker as far as I can remember.

Prepending /bin/sh -c can cause unexpected issues and behavior that is not easily understood if you do not know that docker modifies your CMD. Therefore, you should always use the array syntax for both instructions, so that they are executed exactly as you intended.
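
Concretely, here is what docker actually executes for each form:

# shell form: docker actually runs /bin/sh -c '/bin/echo hello'
CMD /bin/echo hello

# exec form: docker runs /bin/echo directly, exactly as written
CMD ["/bin/echo", "hello"]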

4. Always use the array syntax when using CMD and ENTRYPOINT.

5: CMD and ENTRYPOINT better together

In case you don't know, ENTRYPOINT makes your dockerized application behave like a binary. You can pass arguments to the ENTRYPOINT during docker run without worrying about it being overwritten (unlike CMD). ENTRYPOINT is even better when used with CMD. Let's check out my Rethinkdb Dockerfile and see how to use this.

# Dockerfile for Rethinkdb
# http://www.rethinkdb.com/

FROM ubuntu

MAINTAINER Michael Crosby <michael@crosbymichael.com>

RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get upgrade -y

RUN apt-get install -y python-software-properties
RUN add-apt-repository ppa:rethinkdb/ppa
RUN apt-get update
RUN apt-get install -y rethinkdb

# Rethinkdb process
EXPOSE 28015
# Rethinkdb admin console
EXPOSE 8080

# Create the /rethinkdb_data dir structure
RUN /usr/bin/rethinkdb create

ENTRYPOINT ["/usr/bin/rethinkdb"]

CMD ["--help"]

This is everything that is required to get Rethinkdb dockerized. We have my standard 5 lines at the top to make sure the base image is updated, ports exposed, etc... With the ENTRYPOINT set, we know that whenever this image is run, all arguments passed during docker run will be arguments to the ENTRYPOINT ( /usr/bin/rethinkdb ).

I also have a default CMD set in the Dockerfile: --help. What this does is, in case no arguments are passed during docker run, display rethinkdb's default help output to the user. This is the same functionality that you would expect when interacting with the rethinkdb binary.

docker run crosbymichael/rethinkdb


Output

Running 'rethinkdb' will create a new data directory or use an existing one,
  and serve as a RethinkDB cluster node.
File path options:
  -d [ --directory ] path           specify directory to store data and metadata
  --io-threads n                    how many simultaneous I/O operations can happen
                                    at the same time

Machine name options:
  -n [ --machine-name ] arg         the name for this machine (as will appear in
                                    the metadata).  If not specified, it will be
                                    randomly chosen from a short list of names.

Network options:
  --bind {all | addr}               add the address of a local interface to listen
                                    on when accepting connections; loopback
                                    addresses are enabled by default
  --cluster-port port               port for receiving connections from other nodes
  --driver-port port                port for rethinkdb protocol client drivers
  -o [ --port-offset ] offset       all ports used locally will have this value
                                    added
  -j [ --join ] host:port           host and port of a rethinkdb node to connect to
  .................

Now let's run the container with the --bind all argument.

docker run crosbymichael/rethinkdb --bind all


Output

info: Running rethinkdb 1.7.1-0ubuntu1~precise (GCC 4.6.3)...
info: Running on Linux 3.2.0-45-virtual x86_64
info: Loading data from directory /rethinkdb_data
warn: Could not turn off filesystem caching for database file: "/rethinkdb_data/metadata" (Is the file located on a filesystem that doesn't support direct I/O (e.g. some encrypted or journaled file systems)?) This can cause performance problems.
warn: Could not turn off filesystem caching for database file: "/rethinkdb_data/auth_metadata" (Is the file located on a filesystem that doesn't support direct I/O (e.g. some encrypted or journaled file systems)?) This can cause performance problems.
info: Listening for intracluster connections on port 29015
info: Listening for client driver connections on port 28015
info: Listening for administrative HTTP connections on port 8080
info: Listening on addresses: 127.0.0.1, 172.16.42.13
info: Server ready
info: Someone asked for the nonwhitelisted file /js/handlebars.runtime-1.0.0.beta.6.js, if this should be accessible add it to the whitelist.

And there it is, a full Rethinkdb instance running with access to the db and admin console, by interacting with the image the same way you interact with the binary. Very powerful and yet extremely simple. I love simple.

5. ENTRYPOINT and CMD are better together.


I hope this post helps you get started working with Dockerfiles and building images that we all can use and benefit from. Going forward, I believe that Dockerfiles will be a very important part of what makes docker so simple and easy to use, whether you are consuming or producing images. I plan to invest much of my time in providing a complete, powerful, yet simple solution for building docker images via the Dockerfile.
