[BUG] Cannot upload large CVAT annotation tasks: Request Entity Too Large

Hi,

When uploading to CVAT I am receiving an OSError: [Errno 24] Too many open files.

The script works fine with a couple of test images, but as soon as I move to a large dataset (~16,000 images) it crashes with the above error.

Things I have tried:

  • Tested with small dataset and confirmed working.
  • Tested with 2 different CVAT servers (including CVAT.org)
  • Increased the open-file limit (ulimit -n) on the Ubuntu 18 machine to 1048576
  • Attempted to use segments to upload.

Appreciate any suggestions or help.
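On the ulimit point above: a limit raised in one shell does not necessarily apply to the Python process that runs the script, so it can help to check the limit from inside Python. The following is a minimal sketch using only the standard `resource` module (Unix only); the helper names `open_file_limits` and `raise_soft_limit` are made up for illustration and are not part of FiftyOne.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Set the soft open-file limit to `target`, capped at the hard limit.

    Only affects the current process and its children. Returns the new soft limit.
    """
    _, hard = open_file_limits()
    # RLIM_INFINITY means "no ceiling", so there is nothing to cap against.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


soft, hard = open_file_limits()
print(f"open-file limit: soft={soft}, hard={hard}")
```

If the soft value printed here is much lower than the number of images, the limit set in the shell is probably not reaching this process.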

Full trace:

100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16683/16683 [21.9s elapsed, 0s remaining, 972.2 samples/s]      
Name:        Dataset-10
Media type:  image
Num samples: 16683
Persistent:  True
Tags:        []
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
Uploading samples to CVAT...
Computing image metadata...
 100% |████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16683/16683 [20.7s elapsed, 0s remaining, 963.0 samples/s]      
Uncaught exception
Traceback (most recent call last):
  File "/home/inviol/dev/AI_tools/data_management/fiftyone/fiftyone_yolo_dataset_to_CVAT.py", line 53, in <module>
    password="g_nekgrHj33JPCa"
  File "/home/inviol/dev/AI_tools/venv/lib/python3.6/site-packages/fiftyone/core/collections.py", line 6188, in annotate
  File "/home/inviol/dev/AI_tools/venv/lib/python3.6/site-packages/fiftyone/utils/annotations.py", line 212, in annotate
  File "/home/inviol/dev/AI_tools/venv/lib/python3.6/site-packages/fiftyone/utils/cvat.py", line 2657, in upload_annotations
  File "/home/inviol/dev/AI_tools/venv/lib/python3.6/site-packages/fiftyone/utils/cvat.py", line 3516, in upload_samples
  File "/home/inviol/dev/AI_tools/venv/lib/python3.6/site-packages/fiftyone/utils/cvat.py", line 3942, in _create_task_upload_data
  File "/home/inviol/dev/AI_tools/venv/lib/python3.6/site-packages/fiftyone/utils/cvat.py", line 3374, in upload_data
OSError: [Errno 24] Too many open files: '/home/inviol/inviol/new_model/new_data/Dataset-10-Vulcan-Device/19.10.2021 3969.jpg'

Script:

import fiftyone as fo

name = "Dataset-10"

# Delete any existing persistent dataset with the same name
if fo.dataset_exists(name):
    fo.delete_dataset(name)

dataset_dir = "/home/inviol/inviol/new_model/new_data"

dataset_type = fo.types.YOLOv4Dataset  

dataset = fo.Dataset.from_dir(
    dataset_dir=dataset_dir,
    dataset_type=dataset_type,
    name=name,
)

dataset = fo.load_dataset(name)
dataset.persistent = True
print(dataset)

view = dataset.view()

anno_key = "dataset_10_vulcan_12"

view.annotate(
    anno_key,
    label_field="ground_truth",
    attributes=False,
    launch_editor=False,
    #url="",
    #username="",
    #segment_size=100,
    #password="",
    #image_quality=100,
    username="",
    password=""
)
print(dataset.get_annotation_info(anno_key))

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
brimoor commented, Nov 26, 2021

In the meantime, a simple workaround is to split the tasks into multiple smaller runs yourself.

Adding the runs to a project would be a good way to organize the tasks under one roof:

# The dataset or view you want to annotate
dataset = ...

dataset[:1000].annotate("run1", ..., project_name="your-project")
dataset[1000:2000].annotate("run2", ..., project_name="your-project")
...

dataset.load_annotations("run1")
dataset.load_annotations("run2")
...
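The manual slicing above can be wrapped in a loop. This is a rough sketch, not part of FiftyOne: `annotate_in_batches` is a hypothetical helper, and it assumes only what the snippet above already uses, namely that the collection supports `len()`, slicing, and `annotate()`. The batch size of 1000 is taken from the example and may need tuning for a given CVAT server.

```python
def annotate_in_batches(collection, key_prefix, batch_size=1000, **annotate_kwargs):
    """Create one annotation run per consecutive slice of `collection`.

    `collection` is assumed to be a FiftyOne dataset or view. Extra keyword
    arguments (label_field, project_name, ...) are passed to annotate() as-is.
    Returns the annotation keys in the order they were created.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    anno_keys = []
    for run, start in enumerate(range(0, len(collection), batch_size), 1):
        anno_key = f"{key_prefix}{run}"  # run1, run2, ...
        collection[start:start + batch_size].annotate(anno_key, **annotate_kwargs)
        anno_keys.append(anno_key)
    return anno_keys


# Hypothetical usage with a real dataset:
# keys = annotate_in_batches(
#     dataset, "run", label_field="ground_truth", project_name="your-project"
# )
# for key in keys:
#     dataset.load_annotations(key)
```

Returning the keys keeps the later `load_annotations()` calls in step with the runs that were actually created.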

1 reaction
brimoor commented, Nov 26, 2021

Yeah, it looks like the implementation of upload_data() needs to be updated to upload media for large tasks in batches, to satisfy any size limits on the CVAT API side.

@ehofesmann can you help with this?

https://github.com/voxel51/fiftyone/blob/935cad90646364c97834fdd935f9be6141d504d1/fiftyone/utils/cvat.py#L3327
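For illustration only, here is a generic sketch of the batching idea. It is not the actual upload_data() code or its eventual fix. It assumes the Errno 24 comes from holding one open file handle per image at the same time, which the traceback suggests but this thread does not confirm. The helper name `open_in_batches` and the batch size are made up.

```python
from contextlib import ExitStack


def open_in_batches(paths, batch_size=500):
    """Yield lists of open binary file handles, at most `batch_size` at a time.

    Each batch is closed before the next one is opened, so the number of
    simultaneously open files never exceeds `batch_size`.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    for start in range(0, len(paths), batch_size):
        with ExitStack() as stack:
            yield [
                stack.enter_context(open(path, "rb"))
                for path in paths[start:start + batch_size]
            ]
        # Leaving the `with` block closes every handle in this batch.
```

A caller would send one request per yielded batch and must finish with the handles before asking for the next batch, since they are closed at that point.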
