How to list objects in AWS S3 when there are more than 1,000

알면 알 수록 재밌다! 2023. 3. 6. 07:00

How to list the objects in an S3 bucket with Boto3, the Python SDK, when the bucket holds more than 1,000 objects.

S3 — Boto3 Docs 1.26.81 documentation

boto3.amazonaws.com

The approach in the documentation linked above is said to be the one usually recommended.

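import boto3

client = boto3.client('s3')

# A single list_objects_v2 call returns at most 1,000 keys.
response = client.list_objects_v2(Bucket='your bucket name', Prefix='your directory name')
# 'Contents' is absent when nothing matches the prefix, so fall back to an empty list.
keys = [obj['Key'] for obj in response.get('Contents', [])]
print("length:", len(keys))

length: 1000

As shown above, only 1,000 keys are returned.

# Keep requesting pages while the response reports that more results remain.
while response['IsTruncated']:
    response = client.list_objects_v2(
        Bucket='your bucket name',
        Prefix='your directory name',
        ContinuationToken=response['NextContinuationToken'],
    )
    keys += [obj['Key'] for obj in response.get('Contents', [])]
print("length:", len(keys))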

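If IsTruncated is false, there are no more objects left to return; if it is true, some objects have not been returned yet. So as long as it is true, the keys from each response keep getting appended to the list.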

NextContinuationToken is included in the response only when IsTruncated is true. Passing this opaque token as ContinuationToken in the next request returns the next batch of objects. In short, it is the value that drives pagination.

length: 1132
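To watch the IsTruncated / NextContinuationToken handshake without touching AWS, the loop can be wrapped in a function and run against a stub. This is a minimal sketch: `FakeS3Client` and `list_all_keys` are hypothetical names invented for illustration. The stub only imitates the shape of a `list_objects_v2` response and uses a plain offset as the token, whereas real S3 tokens are opaque strings.

```python
from typing import Any, Dict, List, Optional

PAGE_SIZE = 1000  # S3 returns at most 1,000 keys per list_objects_v2 call


class FakeS3Client:
    """Hypothetical in-memory stand-in mimicking the list_objects_v2 response shape."""

    def __init__(self, keys: List[str]) -> None:
        self._keys = sorted(keys)

    def list_objects_v2(self, Bucket: str, Prefix: str = "",
                        ContinuationToken: Optional[str] = None) -> Dict[str, Any]:
        matching = [k for k in self._keys if k.startswith(Prefix)]
        # Assumption for illustration: the token is just the next offset.
        start = int(ContinuationToken) if ContinuationToken else 0
        page = matching[start:start + PAGE_SIZE]
        end = start + len(page)
        response: Dict[str, Any] = {
            "KeyCount": len(page),
            "IsTruncated": end < len(matching),
        }
        if page:
            response["Contents"] = [{"Key": k} for k in page]
        if response["IsTruncated"]:
            response["NextContinuationToken"] = str(end)
        return response


def list_all_keys(client: Any, bucket: str, prefix: str = "") -> List[str]:
    """Collect every key under prefix by following continuation tokens."""
    keys: List[str] = []
    kwargs: Dict[str, Any] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        if not response["IsTruncated"]:
            return keys
        # Hand the token back so the next call resumes where this one stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

With a real client the call would look like `list_all_keys(boto3.client('s3'), 'your bucket name', 'your directory name')`.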

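With the loop above, every page of list_objects_v2 results is collected, not only the first 1,000 keys. Boto3's paginator offers the same traversal through a page_iterator object.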
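For comparison, here is a minimal sketch of the paginator route, which hides the token handling. `iter_keys` is a hypothetical helper name; `get_paginator('list_objects_v2')` and `paginate(...)` are the Boto3 calls it relies on. The client is passed in as an argument so the function can be exercised with any object exposing `get_paginator`.

```python
from typing import Any, Iterator


def iter_keys(client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every key under prefix using the client's list_objects_v2 paginator."""
    paginator = client.get_paginator("list_objects_v2")
    # paginate() sends ContinuationToken on our behalf for each following page.
    page_iterator = paginator.paginate(Bucket=bucket, Prefix=prefix)
    for page in page_iterator:
        # 'Contents' is missing from a page that holds no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Usage with a real client (requires AWS credentials; names are placeholders):
#   import boto3
#   keys = list(iter_keys(boto3.client("s3"), "your bucket name", "your directory name"))
#   print("length:", len(keys))
```

Because the function is a generator, keys can be processed as pages arrive instead of holding the whole listing in memory first.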

References

https://dev.classmethod.jp/articles/boto3_s3_object_more_then_1000_kr/

https://velog.io/@samnaka/How-to-get-over-1000-objects-from-s3