CodeArtifact: An Internal Package Repository for Builds

Kai · 4 min read

The build in Part II pulls pytest from the internet every time it runs. With a real app, a single build might download hundreds of packages from PyPI or npm, and every external source is a risk: a package gets yanked, the registry goes down, or a version changes behind your back. CodeArtifact addresses that. It is an AWS-managed package repository that acts both as a pull-through cache for public registries and as a store for your private packages. This article, which makes up all of Part III, creates the repository, publishes a package, reinstalls it from the repository, and wires it into CodeBuild.

Goal

Create a CodeArtifact domain and repository, understand the external connection mechanism (proxying a public registry) and the auth token, publish then consume a package, and have CodeBuild pull packages from CodeArtifact.

Domain and repository

CodeArtifact has two layers. A domain is the high-level container, grouping repositories and managing shared encryption/permissions. A repository is where packages actually live. Create both:

$ aws codeartifact create-domain --domain awscicd \
    --query 'domain.[name,status]' --output text
awscicd Active

$ aws codeartifact create-repository --domain awscicd --repository demo-repo \
    --query 'repository.name' --output text
demo-repo

The newly created repo is empty. Connect it to public PyPI so it can proxy packages from there:

$ aws codeartifact associate-external-connection --domain awscicd \
    --repository demo-repo --external-connection public:pypi \
    --query 'repository.externalConnections[0].externalConnectionName' --output text
public:pypi

Now demo-repo both stores your private packages and pulls packages from PyPI on demand. Public packages that pass through it are cached, so the next install needs no trip to the internet and keeps working even if the upstream copy disappears.
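
External connection names follow a fixed public:<registry> pattern. As a rough sketch, a small helper can map a package format to the name passed to associate-external-connection. The function name external_connection_for is made up for illustration, and only a few of the connection names CodeArtifact defines are listed, so check the current list before relying on it.

```shell
# Hypothetical helper: map a package format to a CodeArtifact
# external connection name. Only a subset of names is listed.
external_connection_for() {
  case "$1" in
    pypi)  echo "public:pypi" ;;
    npm)   echo "public:npmjs" ;;
    maven) echo "public:maven-central" ;;
    nuget) echo "public:nuget-org" ;;
    *)     echo "unknown format: $1" >&2; return 1 ;;
  esac
}

# Prints the name used in this article.
external_connection_for pypi
```

The result can be passed straight to the CLI, for example --external-connection "$(external_connection_for npm)" on a repository meant for npm packages.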

Auth token: the login mechanism

CodeArtifact does not use a separate username and password per tool. Instead, your AWS credentials are exchanged for a temporary authorization token (valid for 12 hours by default), and the tool (pip, npm, twine) is configured to use that token. The login command does both steps. Log in for twine, the tool that publishes Python packages:

$ aws codeartifact login --tool twine --domain awscicd --repository demo-repo
Successfully configured twine to use AWS CodeArtifact repository
  https://awscicd-111122223333.d.codeartifact.ap-southeast-1.amazonaws.com/pypi/demo-repo/
Login expires in 12 hours at 2026-05-26 01:01:27+07:00

The token expires after 12 hours, after which you must log in again. As with the service role's temporary credentials in Article 2, there is no fixed password to leak.
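
For scripts where login is not convenient, the same setup can be done by hand: fetch the token and the repository endpoint, then embed the token in the index URL with the user name aws. The sketch below assumes the endpoint format shown in the login output above. The function names codeartifact_index_url and configure_pip_by_hand are illustrative, and the second one needs real AWS credentials, so it is defined but not called.

```shell
# Build a pip index URL from a repository endpoint and a token.
# Pure string handling, so it can be tried without AWS access.
codeartifact_index_url() {
  endpoint="$1"                      # e.g. https://<domain>-<account>.d.codeartifact.<region>.amazonaws.com/pypi/<repo>/
  token="$2"
  host_path="${endpoint#https://}"   # drop the scheme
  host_path="${host_path%/}"         # drop a trailing slash, if any
  printf 'https://aws:%s@%s/simple/\n' "$token" "$host_path"
}

# Hypothetical wiring (requires AWS credentials; not invoked here).
configure_pip_by_hand() {
  token=$(aws codeartifact get-authorization-token --domain awscicd \
      --query authorizationToken --output text) || return 1
  endpoint=$(aws codeartifact get-repository-endpoint --domain awscicd \
      --repository demo-repo --format pypi \
      --query repositoryEndpoint --output text) || return 1
  pip config set global.index-url "$(codeartifact_index_url "$endpoint" "$token")"
}

# Demo with a placeholder token (no AWS call is made):
codeartifact_index_url \
  "https://awscicd-111122223333.d.codeartifact.ap-southeast-1.amazonaws.com/pypi/demo-repo/" \
  "EXAMPLE_TOKEN"
```

Either way the token ends up in pip's configuration in plain text, which is one more reason it is short-lived.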

Publish a package

Create a minimal Python package, then build it:

$ cat pyproject.toml
[project]
name = "cicddemo-hello"
version = "0.1.0"
...
$ python3 -m build --wheel --sdist
Successfully built cicddemo_hello-0.1.0.tar.gz and cicddemo_hello-0.1.0-py3-none-any.whl

Upload to CodeArtifact via twine (logged in above):

$ twine upload --repository codeartifact dist/*
Uploading cicddemo_hello-0.1.0-py3-none-any.whl
Uploading cicddemo_hello-0.1.0.tar.gz

Verify the package is now in the repo:

$ aws codeartifact list-package-versions --domain awscicd --repository demo-repo \
    --format pypi --package cicddemo-hello \
    --query 'versions[].[version,status]' --output text
0.1.0   Published

Consume: reinstall from CodeArtifact

Log pip in to the same repo, then install the package just published:

$ aws codeartifact login --tool pip --domain awscicd --repository demo-repo
Login expires in 12 hours ...

$ pip install cicddemo-hello
$ python -c "import cicddemo_hello; print(cicddemo_hello.hello())"
hello from CodeArtifact package

The package we just published installed and ran, and it came from CodeArtifact rather than PyPI. The same repo also serves public packages through the external connection, so a single pip install resolves both private and public packages from one source.
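
In a script it can be worth checking that pip really points at CodeArtifact before installing, because without the login step pip still uses its default index, PyPI. A minimal sketch, assuming pip config get global.index-url reports the URL that login wrote. The helper name is_codeartifact_url is made up here, and the pattern only covers the endpoint shape shown in the login output.

```shell
# Succeeds when the URL looks like a CodeArtifact repository endpoint.
is_codeartifact_url() {
  case "$1" in
    https://*.d.codeartifact.*.amazonaws.com/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example checks on literal URLs (no pip or AWS call involved):
is_codeartifact_url "https://aws:TOKEN@awscicd-111122223333.d.codeartifact.ap-southeast-1.amazonaws.com/pypi/demo-repo/simple/" \
  && echo "codeartifact"
is_codeartifact_url "https://pypi.org/simple" || echo "not codeartifact"
```

A guard line such as is_codeartifact_url "$(pip config get global.index-url)" || exit 1 placed before pip install would stop the script early instead of quietly pulling from PyPI.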

   pip / twine
      │  login → temporary auth token (12h, from AWS credentials)
      ▼
   ┌──────────── CodeArtifact: domain awscicd / repo demo-repo ───────────┐
   │   private package:  cicddemo-hello 0.1.0  (twine upload)              │
   │   external connection → public:pypi  (proxy + cache public packages)  │
   └───────────────────────────────────────────────────────────────────────┘

Wire into CodeBuild

To have the build pull packages from CodeArtifact, do two things. One, grant the CodeBuild service role permission to get a token and read the repo:

{
  "Effect": "Allow",
  "Action": [
    "codeartifact:GetAuthorizationToken",
    "codeartifact:GetRepositoryEndpoint",
    "codeartifact:ReadFromRepository"
  ],
  "Resource": "*"
},
{
  "Effect": "Allow",
  "Action": "sts:GetServiceBearerToken",
  "Resource": "*",
  "Condition": { "StringEquals": { "sts:AWSServiceName": "codeartifact.amazonaws.com" } }
}

Two, add the login step to the install phase of buildspec.yml — the build gets the token itself using its role, no key needed:

phases:
  install:
    commands:
      - aws codeartifact login --tool pip --domain awscicd --repository demo-repo
      - pip install -r requirements.txt   # now pulls from CodeArtifact

The sts:GetServiceBearerToken permission is the easy-to-forget piece: a CodeArtifact token is in fact a bearer token issued by STS, so without this permission login inside the build fails even with the codeartifact permissions in place.
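
The policy above uses "Resource": "*" for brevity. To narrow it, GetAuthorizationToken can be scoped to the domain ARN, and GetRepositoryEndpoint and ReadFromRepository to the repository ARN, while the sts:GetServiceBearerToken statement keeps "*". The sketch below only assembles those ARNs. The function names are illustrative, the standard aws partition is assumed, and the account ID and region are the placeholder values from the login output.

```shell
# ARN of a CodeArtifact domain. Args: region account domain
codeartifact_domain_arn() {
  printf 'arn:aws:codeartifact:%s:%s:domain/%s\n' "$1" "$2" "$3"
}

# ARN of a CodeArtifact repository. Args: region account domain repository
codeartifact_repo_arn() {
  printf 'arn:aws:codeartifact:%s:%s:repository/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Values from this article (account ID is the placeholder from the login output).
codeartifact_domain_arn ap-southeast-1 111122223333 awscicd
codeartifact_repo_arn ap-southeast-1 111122223333 awscicd demo-repo
```

The two printed ARNs would replace "*" in the Resource field of the first statement, split into one statement per resource type.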

🧹 Cleanup

$ aws codeartifact delete-repository --domain awscicd --repository demo-repo
$ aws codeartifact delete-domain --domain awscicd

CodeArtifact charges by storage and requests; a few small packages cost almost nothing, but you should still delete it when done learning.
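
Deleting the repository does not undo the local login: pip still has the CodeArtifact URL in its configuration, so later installs on this machine would fail once the repository is gone. A small sketch of the local reset, assuming login wrote global.index-url to pip's user configuration. The function name reset_pip_index is illustrative, and it is defined here without being called.

```shell
# Hypothetical local cleanup after the repository is deleted.
reset_pip_index() {
  # Remove the index URL that login stored; ignore the error if it is not set.
  pip config unset global.index-url 2>/dev/null || true
  # Show the remaining settings; no global.index-url line means pip
  # is back on its default index.
  pip config list
}
```

The twine login also leaves a codeartifact entry in ~/.pypirc, which can be removed by hand.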

Wrap-up

CodeArtifact is a managed package repository: a domain groups repositories, a repository stores packages, and an external connection (public:pypi, public:npmjs, and others) lets it proxy and cache public registries. Authentication uses a temporary 12-hour token derived from your AWS credentials via aws codeartifact login, with no fixed password. We published a Python package, reinstalled it from the repository, and wired it into CodeBuild by granting the service role the CodeArtifact read permissions (GetAuthorizationToken, GetRepositoryEndpoint, ReadFromRepository) plus sts:GetServiceBearerToken, then running login in the install phase. The result: builds pull both private and public packages from one controlled source.

Part III closes here. Part IV is the longest and most important section: CodeDeploy putting artifacts onto EC2. The next article lays the foundation — install the agent, create an application and deployment group, write appspec.yml, and run the first in-place deploy to a real EC2 instance.