Update Parsing Logic for Contest Page Structure #10

Closed
opened 2024-10-02 10:29:29 +00:00 by kumi · 1 comment
Owner

Description:

When accessing the /contest/ endpoint, the application raises an IndexError due to changes in the HTML structure of the Instructables contest page. This issue was initially reported by @vlnst in #9.

Error Traceback:

ERROR:structables.main:Exception on /contest/ [GET]
Traceback (most recent call last):
  File "/opt/venv/lib/python3.12/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  ...
  File "/opt/venv/lib/python3.12/site-packages/structables/routes/contest.py", line 123, in route_contests
    contest_count = str(soup.select("p.contest-count")[0])
IndexError: list index out of range

Cause:

The issue stems from a change to the markup of the Instructables contest page. The selector p.contest-count no longer matches any element, so soup.select("p.contest-count") returns an empty list and indexing it with [0] raises an IndexError.
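
The failure mode can be reproduced without the live page. The following is a minimal sketch, assuming the selector result behaves like an ordinary Python list; the empty list stands in for what soup.select("p.contest-count") returns against the new page structure.

```python
# Stand-in for soup.select("p.contest-count") when no element matches:
# the result is an empty, list-like collection.
matches = []

try:
    # Same expression shape as line 123 of contest.py in the traceback.
    contest_count = str(matches[0])
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```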

Steps to Reproduce:

  • Deploy the current application version.
  • Access the /contest/ endpoint.
  • Observe the IndexError in the logs due to the missing HTML element.

Expected Behavior:

The application should correctly parse the Instructables contest page and gracefully handle cases where expected elements are missing.

Solution:

  • Review and update the HTML parsing logic in the route_contests function of structables/routes/contest.py to accommodate the new structure of the Instructables contest page.
  • Implement error handling to provide a fallback message or default value when an element is not found.
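
The second point could be approached as in the sketch below. This is an illustration only, not the fix shipped in the project: the helper name contest_count_or_default, the constant FALLBACK_COUNT, and its message text are all hypothetical, and the only assumption made about the selector result is that it is list-like.

```python
import logging
from typing import Sequence

logger = logging.getLogger(__name__)

# Hypothetical fallback text; the real application may prefer another default.
FALLBACK_COUNT = "Contest count unavailable"


def contest_count_or_default(
    matches: Sequence[object], default: str = FALLBACK_COUNT
) -> str:
    """Return the first matched element as a string, or a default if none matched."""
    if not matches:
        # An empty result means the page structure no longer fits the selector.
        logger.warning("Selector matched no elements; using fallback value")
        return default
    return str(matches[0])
```

Inside route_contests, the failing line would then become something like contest_count = contest_count_or_default(soup.select("p.contest-count")), so a future markup change degrades to the fallback text instead of an unhandled exception. The selector itself still needs updating to match the new page.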
Author
Owner

Fixed in v0.3.13.
kumi closed this issue 2024-10-04 05:11:42 +00:00
Reference: PrivateCoffee/structables#10