dataset get file list broken for v2, route changed on v2 backend #100

tcnichol · 2024-03-30T17:56:22Z

To replicate the bug:

Run a dataset extractor (I ran the huggingface ray extractor.)

In connectors._build_resource these lines cause an error:

    datasetinfo = pyclowder.datasets.get_info(self, host, secret_key, datasetid)
    filelist = pyclowder.datasets.get_file_list(self, host, secret_key, datasetid)
    triggering_file = None
    for f in filelist:
        if f['id'] == fileid:
            triggering_file = f['filename']
            break

This happens because the response of the /files endpoint in the dataset router has changed. It used to be a plain list of files; now it is a paged response containing "metadata" and "data" fields. When the loop looks up f['id'] on those objects, the key is not there, so it throws an error.

My fix is to have datasets.get_file_list return only the files, i.e. the "data" field of the response from the backend route.

…s related to the pagination, and the data is the actual list of files. We want to get the list of files here only, or else other parts of the code will break.
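A minimal sketch of the failure and the fix described above. The payload is made up for illustration: only the "metadata" and "data" field names come from this description, while the keys inside "metadata", the sample file entries, and the helper name find_triggering_file are assumptions. It also assumes the paged response is parsed into a Python dict.

```python
# Hypothetical paged response; only the "metadata" and "data" field names
# are taken from the PR description, the rest is illustrative.
paged_response = {
    "metadata": {"total_count": 2, "skip": 0, "limit": 10},
    "data": [
        {"id": "f1", "filename": "a.csv"},
        {"id": "f2", "filename": "b.csv"},
    ],
}


def find_triggering_file(filelist, fileid):
    """Return the filename of the entry whose id matches, else None."""
    for f in filelist:
        if f["id"] == fileid:
            return f["filename"]
    return None


# Iterating over the dict itself yields its keys (strings), so the
# f["id"] lookup raises instead of finding a file entry.
try:
    find_triggering_file(paged_response, "f2")
    failed = False
except TypeError:
    failed = True

# Passing only the "data" list restores the old list-of-files behaviour.
triggering_file = find_triggering_file(paged_response["data"], "f2")
```

With the "data" list passed in, the lookup loop from connectors._build_resource can stay unchanged.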

max-zilla · 2024-05-28T14:18:23Z

pyclowder/datasets.py

@@ -128,7 +128,7 @@ def get_file_list(connector, host, key, datasetid):
    datasetid -- the dataset to get filelist of
    """
    client = ClowderClient(host=host, key=key)
-    file_list = datasets.get_file_list(connector, client, datasetid)
+    file_list = datasets.get_file_list(connector, client, datasetid)['data']


Shouldn't this change be in pyclowder/v2/datasets.py rather than here? I don't think the v1 endpoint will have a data field, so it might break backward compatibility.

tcnichol · 2024-05-28

This should be fixed now.
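The thread resolves the compatibility concern by moving the change into the v2 method. As an illustration only of why the two response shapes matter, here is a sketch of a helper that accepts either shape. The name extract_file_list is hypothetical and is not part of pyclowder; it assumes a v1-style response is a plain list and a v2-style response is a dict with a "data" field.

```python
def extract_file_list(response):
    """Return a plain list of file entries from either response shape.

    Assumption: a paged (v2-style) response is a dict with a "data" field,
    while a v1-style response is already a list.
    """
    if isinstance(response, dict) and "data" in response:
        return response["data"]
    return response


# Hypothetical sample payloads for the two shapes.
v1_style = [{"id": "f1", "filename": "a.csv"}]
v2_style = {"metadata": {}, "data": [{"id": "f1", "filename": "a.csv"}]}

files_v1 = extract_file_list(v1_style)
files_v2 = extract_file_list(v2_style)
```

Keeping the change in the version-specific module, as the review asks, avoids this kind of shape sniffing in shared code.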

Vismayak · 2024-06-12T15:28:13Z

Just needs to be rebased but works! 😄

the backend route returns 2 fields: metadata and data. The metadata i…

a5486c1

…s related to the pagination, and the data is the actual list of files. We want to get the list of files here only, or else other parts of the code will break.

tcnichol linked an issue Mar 30, 2024 that may be closed by this pull request

dataset file list returns paged instead of list of files #99

Closed

tcnichol requested review from ddey2 and max-zilla March 30, 2024 17:56

tcnichol changed the title ~~the backend route returns 2 fields: metadata and data. The metadata i…~~ dataset get file list broken for v2, route changed on v2 backend Mar 30, 2024

tcnichol requested review from longshuicy and lmarini March 30, 2024 19:56

max-zilla requested changes May 28, 2024

moving this to the v2 method

0c8cd64

Vismayak approved these changes Jun 12, 2024

ddey2 approved these changes Jun 12, 2024

Merge branch 'master' into 99-dataset-file-list-returns-paged-instead…

2b358c1

…-of-list-of-files

max-zilla approved these changes Jun 17, 2024

max-zilla merged commit 912423a into master Jun 17, 2024
9 checks passed

max-zilla deleted the 99-dataset-file-list-returns-paged-instead-of-list-of-files branch June 17, 2024 17:23
