======================================
An Introduction to boto's S3 interface
======================================

This tutorial focuses on the boto interface to the Simple Storage Service
from Amazon Web Services. This tutorial assumes that you have already
downloaded and installed boto.

Creating a Connection
---------------------

The first step in accessing S3 is to create a connection to the service.
There are two ways to do this in boto. The first is:

>>> from boto.s3.connection import S3Connection
>>> conn = S3Connection('<aws access key>', '<aws secret key>')

At this point the variable conn will point to an S3Connection object. In
this example, the AWS access key and AWS secret key are passed to the
constructor explicitly. Alternatively, you can set the environment variables:

AWS_ACCESS_KEY_ID - Your AWS Access Key ID
AWS_SECRET_ACCESS_KEY - Your AWS Secret Access Key

and then call the constructor without any arguments, like this:

>>> conn = S3Connection()

There is also a shortcut function in the boto package, called connect_s3,
that may provide a slightly easier means of creating a connection:

>>> import boto
>>> conn = boto.connect_s3()

In either case, conn will point to an S3Connection object, which we will
use throughout the remainder of this tutorial.
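
Because the no-argument forms rely on the two environment variables named
above, it can help to confirm that both are set before connecting. The
following is a minimal sketch that uses only the standard library; the helper
name missing_aws_env_vars is hypothetical and is not part of boto.

```python
import os

# The two variable names come from the text above.
REQUIRED_VARS = ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY')


def missing_aws_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

If the returned list is empty, both variables are present in the environment
and the no-argument calls shown above have what they need to look up the
credentials.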

Creating a Bucket
-----------------

Once you have a connection established with S3, you will probably want to
create a bucket. A bucket is a container used to store key/value pairs
in S3. A bucket can hold an unlimited amount of data, so you could potentially
have just one bucket in S3 for all of your information. Or, you could create
separate buckets for different types of data. You can figure all of that out
later; first, let's just create a bucket. That can be accomplished like this:

>>> bucket = conn.create_bucket('mybucket')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "boto/connection.py", line 285, in create_bucket
    raise S3CreateError(response.status, response.reason)
boto.exception.S3CreateError: S3Error[409]: Conflict

Whoa. What happened there? Well, the thing you have to know about
buckets is that they are kind of like domain names. It's one flat name
space that everyone who uses S3 shares. So, someone has already created
a bucket called "mybucket" in S3 and that means no one else can grab that
bucket name. So, you have to come up with a name that hasn't been taken yet.
For example, something that uses a unique string as a prefix. Your
AWS_ACCESS_KEY_ID (NOT YOUR SECRET KEY!) could work but I'll leave it to
your imagination to come up with something. I'll just assume that you
found an acceptable name.

The create_bucket method will create the requested bucket if it does not
exist or will return the existing bucket if you already own it.
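
One way to come up with a name that is unlikely to be taken is to add a
random string to a prefix of your choosing. The sketch below is only an
illustration: make_bucket_name is a hypothetical helper, not a boto function,
and a random UUID makes a collision very unlikely rather than impossible. It
also lowercases the prefix on the assumption that you want a DNS-style name,
and it assumes the prefix is short enough to keep the result within S3's
63-character limit.

```python
import uuid


def make_bucket_name(prefix):
    """Build a bucket name from a prefix plus a random suffix."""
    # uuid4().hex is a string of 32 lowercase hexadecimal characters.
    suffix = uuid.uuid4().hex
    return '%s-%s' % (prefix.lower(), suffix)
```

The result can then be passed to the method shown earlier, for example
conn.create_bucket(make_bucket_name('mybucket')).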

Storing Data
------------

Once you have a bucket, presumably you will want to store some data
in it. S3 doesn't care what kind of information you store in your objects
or what format you use to store it. All you need is a key that is unique
within your bucket.

The Key object is used in boto to keep track of data stored in S3. To store
new data in S3, start by creating a new Key object:

>>> from boto.s3.key import Key
>>> k = Key(bucket)
>>> k.key = 'foobar'
>>> k.set_contents_from_string('This is a test of S3')

The net effect of these statements is to create a new object in S3 with a
key of "foobar" and a value of "This is a test of S3". To validate that
this worked, quit out of the interpreter and start it up again. Then:

>>> import boto
>>> c = boto.connect_s3()
>>> b = c.create_bucket('mybucket') # substitute your bucket name here
>>> from boto.s3.key import Key
>>> k = Key(b)
>>> k.key = 'foobar'
>>> k.get_contents_as_string()
'This is a test of S3'

So, we can definitely store and retrieve strings. A more interesting
example may be to store the contents of a local file in S3 and then retrieve
the contents to another local file.

>>> k = Key(b)
>>> k.key = 'myfile' # any key name you like
>>> k.set_contents_from_filename('foo.jpg')
>>> k.get_contents_to_filename('bar.jpg')

There are a couple of things to note about this. When you send data to
S3 from a file or filename, boto will attempt to determine the correct
MIME type for that file and send it as a Content-Type header. The boto
package uses the standard mimetypes module in Python to do the MIME type
guessing. The other thing to note is that boto streams the content
to and from S3, so you should be able to send and receive large files without
any problem.
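
To get a feel for the kind of guess the mimetypes module makes, you can call
it directly. This is a small standalone sketch, not boto's own code: the
helper guess_content_type is hypothetical, and the fallback value used for
unrecognized extensions is an assumption made for this example.

```python
import mimetypes


def guess_content_type(filename, default='application/octet-stream'):
    """Guess a Content-Type value from a filename's extension."""
    # guess_type returns a (type, encoding) tuple; the type is None
    # when the extension is not recognized.
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default
```

For the file used above, guess_content_type('foo.jpg') gives 'image/jpeg'.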

Listing All Available Buckets
-----------------------------
In addition to accessing specific buckets via the create_bucket method
you can also get a list of all available buckets that you have created.

>>> rs = conn.get_all_buckets()

This returns a ResultSet object (see the SQS Tutorial for more info on
ResultSet objects). The ResultSet can be used as a sequence or list type
object to retrieve Bucket objects.

>>> for bucket in rs:
...     print bucket.name
...
<listing of available buckets>
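
If you only need the names, the same loop can be wrapped in a small function.
This is a sketch under the assumption that each item returned by
get_all_buckets has a name attribute, as boto's Bucket objects do; the
function bucket_names itself is hypothetical and written so that it also runs
on current Python versions.

```python
def bucket_names(conn):
    """Return the names of all buckets visible to this connection."""
    # get_all_buckets() is the call shown above; each item is expected
    # to expose a ``name`` attribute.
    return [bucket.name for bucket in conn.get_all_buckets()]
```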

Setting / Getting the Access Control List for Buckets and Keys
--------------------------------------------------------------
The S3 service provides the ability to control access to buckets and keys
within S3 via the Access Control List (ACL) associated with each object in
S3. There are two ways to set the ACL for an object:

1. Create a custom ACL that grants specific rights to specific users. At the
   moment, the users that are specified within grants have to be registered
   users of Amazon Web Services so this isn't as useful or as general as it
   could be.

2. Use a "canned" access control policy. There are four canned policies
   defined:

   a. private: Owner gets FULL_CONTROL. No one else has any access rights.
   b. public-read: Owner gets FULL_CONTROL and the anonymous principal is granted READ access.
   c. public-read-write: Owner gets FULL_CONTROL and the anonymous principal is granted READ and WRITE access.
   d. authenticated-read: Owner gets FULL_CONTROL and any principal authenticated as a registered Amazon S3 user is granted READ access.

Currently, boto only supports the second method using canned access control
policies. A future version may allow setting of arbitrary ACLs if there
is sufficient demand.
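
Since only the four canned policies listed above are accepted, a small check
can catch a mistyped policy name before it is ever sent to S3. The sketch
below keeps its own copy of the four names; CANNED_ACLS and check_canned_acl
are hypothetical names used for illustration and are not part of boto.

```python
# The four canned policy names listed above.
CANNED_ACLS = ('private', 'public-read', 'public-read-write',
               'authenticated-read')


def check_canned_acl(acl_name):
    """Return acl_name if it is a known canned policy, else raise."""
    if acl_name not in CANNED_ACLS:
        raise ValueError('unknown canned ACL: %r' % (acl_name,))
    return acl_name
```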

To set the ACL for a bucket, use the set_acl method of the Bucket object.
The argument passed to this method must be one of the four permissible
canned policies named in the list CannedACLStrings contained in acl.py.
For example, to make a bucket readable by anyone:

>>> b.set_acl('public-read')

You can also set the ACL for Key objects, either by passing an additional
argument to the above method:

>>> b.set_acl('public-read', 'foobar')

where 'foobar' is the key of some object within the bucket b, or by
calling the set_acl method of the Key object:

>>> k.set_acl('public-read')

You can also retrieve the current ACL for a Bucket or Key object using the
get_acl method. This method parses the AccessControlPolicy response sent
by S3 and creates a set of Python objects that represent the ACL.

>>> acp = b.get_acl()
>>> acp
<boto.acl.Policy instance at 0x2e6940>
>>> acp.acl
<boto.acl.ACL instance at 0x2e69e0>
>>> acp.acl.grants
[<boto.acl.Grant instance at 0x2e6a08>]
>>> for grant in acp.acl.grants:
...     print grant.permission, grant.grantee
...
FULL_CONTROL <boto.user.User instance at 0x2e6a30>

The Python objects representing the ACL can be found in the acl.py module
of boto.

Setting/Getting Metadata Values on Key Objects
----------------------------------------------
S3 allows arbitrary user metadata to be assigned to objects within a bucket.
To take advantage of this S3 feature, you should use the set_metadata and
get_metadata methods of the Key object to set and retrieve metadata associated
with an S3 object. For example:

>>> k = Key(b)
>>> k.key = 'has_metadata'
>>> k.set_metadata('meta1', 'This is the first metadata value')
>>> k.set_metadata('meta2', 'This is the second metadata value')
>>> k.set_contents_from_filename('foo.txt')

This code associates two metadata key/value pairs with the Key k. To retrieve
those values later:

>>> k = b.get_key('has_metadata')
>>> k.get_metadata('meta1')
'This is the first metadata value'
>>> k.get_metadata('meta2')
'This is the second metadata value'
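
At the HTTP level, S3 carries user metadata in headers whose names begin with
x-amz-meta-. The sketch below shows that naming scheme for the two values set
above. It illustrates the mapping only and does not show how boto builds its
requests internally; metadata_to_headers is a hypothetical helper.

```python
METADATA_PREFIX = 'x-amz-meta-'


def metadata_to_headers(metadata):
    """Map user metadata names to prefixed HTTP header names."""
    return dict((METADATA_PREFIX + name, value)
                for name, value in metadata.items())
```

For example, a metadata name of meta1 corresponds to a header named
x-amz-meta-meta1.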