17 changes: 12 additions & 5 deletions libs/shared/shared/torngit/gitlab.py
@@ -308,7 +308,7 @@
"enterprise_counter": GITLAB_E_API_CALL_COUNTER.labels(
endpoint="list_repos_get_user_and_groups"
),
"url_template": Template("/groups?per_page=100"),
"url_template": Template("/groups?per_page=100&page=${page}"),
},
"list_repos_get_owned_projects": {
"counter": GITLAB_API_CALL_COUNTER.labels(
@@ -722,6 +722,7 @@ async def list_repos(self, username=None, token=None):
user_url = self.count_and_get_url_template("list_repos_get_user").substitute()
user = await self.api("get", user_url, token=token)
user["is_user"] = True
groups = []
if username:
if username.lower() == user["username"].lower():
# just me
@@ -734,10 +735,16 @@
groups = [(await self.api("get", groups_url, token=token))]
else:
# user and all groups
url = self.count_and_get_url_template(
"list_repos_get_user_and_groups"
).substitute()
groups = await self.api("get", url, token=token)
page = 0
while True:
page += 1
url = self.count_and_get_url_template(
"list_repos_get_user_and_groups"
).substitute(page=page)
groups_paged = await self.api("get", url, token=token)
groups += groups_paged
if len(groups) < 100:
Contributor:

Potential bug: Incorrect pagination logic causes infinite loops and data loss when listing GitLab repos.
  • Description: The pagination loop for fetching GitLab groups has a flawed break condition. It checks the total number of accumulated groups (len(groups)) instead of the number of groups returned on the current page (len(groups_paged)). This can cause an infinite loop if an organization has enough groups to fill multiple pages (e.g., 100 or more), leading to resource exhaustion and a potential service crash. It can also lead to data loss by terminating prematurely if the first page has fewer than 100 groups.

  • Suggested fix: The break condition should check the number of items on the current page, not the accumulated total. Change if len(groups) < 100: to if len(groups_paged) < 100:.
    severity: 0.9, confidence: 1.0

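The suggested fix can be illustrated with a minimal, self-contained sketch. Here `fetch_page` and `list_all_groups` are hypothetical stand-ins for the async API call and the surrounding method; the point is only the break condition, which must test the current page's size (len(groups_paged)), not the running total:

```python
# Sketch of the suggested fix: stop when the *current* page comes back
# short, not when the accumulated total crosses the page size.
PER_PAGE = 100  # GitLab's per_page=100 from the URL template

def list_all_groups(fetch_page):
    """fetch_page(page) is a hypothetical stand-in for self.api("get", url)."""
    groups = []
    page = 0
    while True:
        page += 1
        groups_paged = fetch_page(page)
        groups += groups_paged
        if len(groups_paged) < PER_PAGE:  # last (partial or empty) page
            break
    return groups

# Simulated API: 250 groups served in pages of 100.
data = [f"group-{i}" for i in range(250)]
fetch = lambda page: data[(page - 1) * PER_PAGE : page * PER_PAGE]
print(len(list_all_groups(fetch)))  # 250
```

With the original condition (len(groups) < 100), the first page alone pushes the total to 100 and the loop never breaks on a short page, so any owner with 100+ groups loops forever; conversely a short first page under the original code path could stop before later data is fetched.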

Author:

This pagination handling was taken from elsewhere in the existing GitLab code, and I opted for it to stay consistent with existing conventions. GitLab API responses include pagination headers that we should be using across the board to handle pagination properly, but I view that as out of scope for this PR, since we are following existing methods.
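The header-based approach mentioned here can be sketched as follows. GitLab responses carry an X-Next-Page header that is empty on the last page, so the loop can follow it instead of guessing from the page size. The `get` helper and `list_all_groups` names are hypothetical stand-ins for the real API call:

```python
# Hedged sketch of header-driven pagination, assuming GitLab's
# X-Next-Page response header (empty string on the final page).
def list_all_groups(get):
    """get(page) returns (items, headers) — a stand-in for the API call."""
    groups = []
    page = "1"
    while page:
        items, headers = get(int(page))
        groups += items
        page = headers.get("X-Next-Page", "")  # "" terminates the loop
    return groups

# Simulated API: 150 items served 100 per page, with headers.
data = [f"group-{i}" for i in range(150)]

def fake_get(page):
    items = data[(page - 1) * 100 : page * 100]
    next_page = str(page + 1) if page * 100 < len(data) else ""
    return items, {"X-Next-Page": next_page}

print(len(list_all_groups(fake_get)))  # 150
```

This also handles the edge case where the total is an exact multiple of the page size, since the server signals the end explicitly rather than the client inferring it from a short page.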


@nathanbrophy there's actually a python-gitlab library that abstracts much of this away, which could be a useful solution.

Author:

Agreed. I think replacing the GitLab client here with the published, GitLab-sponsored one would be a good follow-on.

break
groups.append(user)

data = []