Text-to-Image Alignment Performance of the pixart-sigma Model #25

Open · xiexiaoshinick opened this issue on Apr 10, 2024 · 4 comments
Labels: good first issue (Good for newcomers)

@xiexiaoshinick

First of all, I would like to express my gratitude for your open-source PixArt-Σ project. As a developer who has been closely following your work, I couldn't wait to test the new model as soon as it was released. I used the GenEval framework to evaluate the model's text-to-image alignment, and the results show that, compared to SDXL and Playground v2.5, there is still room for improvement in this respect.

| Model | Overall | Single obj. | Two obj. | Counting | Colors | Position | Color attr. |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SD1.5 | 42.34 | 95.62 | 37.63 | 37.81 | 74.73 | 3.50 | 4.75 |
| SD1.5-DPO | 43.00 | 96.88 | 39.90 | 38.75 | 75.53 | 3.25 | 3.75 |
| SDXL | 55.63 | 98.12 | 75.25 | 43.75 | 89.63 | 11.25 | 15.75 |
| Playground v2.5 | 56.37 | 97.81 | 77.02 | 51.88 | 83.78 | 11.00 | 16.75 |
| SDXL-DPO | 58.02 | 99.38 | 82.58 | 49.06 | 85.11 | 13.50 | 18.50 |
| PixArt-α (1024) | 47.16 | 97.81 | 46.21 | 45.00 | 77.93 | 9.00 | 7.00 |
| PixArt-Σ (512) | 52.03 | 98.12 | 59.02 | 50.62 | 80.05 | 9.75 | 15.50 |
| PixArt-Σ (1024) | 54.39 | 98.44 | 62.88 | 49.69 | 82.45 | 12.00 | 20.00 |
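
For context, GenEval numbers like these are obtained by generating images for the GenEval prompt set and then scoring them with its detection-based evaluator. A minimal, hypothetical generation sketch using the `PixArtSigmaPipeline` from diffusers is shown below; the checkpoint id, prompt file path, and output layout are assumptions, not the exact setup used for the table above.

```python
# Hypothetical sketch: generate images for GenEval prompts with PixArt-Sigma
# via diffusers, then score them with GenEval's evaluation scripts.
# Checkpoint id, prompt path, and output layout are assumptions.
import json
import torch
from diffusers import PixArtSigmaPipeline

pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

# GenEval ships a JSONL file of evaluation prompts (one object/attribute task per line).
with open("geneval/prompts/evaluation_metadata.jsonl") as f:
    tasks = [json.loads(line) for line in f]

for i, task in enumerate(tasks):
    image = pipe(task["prompt"], num_inference_steps=20).images[0]
    # Note: GenEval's scorer expects a specific per-prompt folder layout;
    # the flat naming here is a simplification.
    image.save(f"outputs/{i:05d}.png")
```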

I noticed that Stable Diffusion 3 adopted DPO (Direct Preference Optimization), which greatly improved its text-to-image alignment. In this regard, I would like to ask whether your team has any plans to incorporate similar optimization methods in future versions to further enhance the model's performance in this area.
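
For reference, the DPO variant used for diffusion models (Diffusion-DPO, Wallace et al., 2023) fine-tunes the denoiser on preferred/rejected image pairs by comparing its denoising error against a frozen reference model at the same timestep and noise sample. A rough PyTorch-style sketch of that loss, with all tensor names hypothetical and not taken from any PixArt code, is:

```python
import torch
import torch.nn.functional as F

def diffusion_dpo_loss(model, ref_model, x_w, x_l, cond, noise, t, beta=5000.0):
    """Sketch of the Diffusion-DPO loss (Wallace et al., 2023).
    x_w / x_l: noised latents of the preferred / rejected image at timestep t,
    built from the same `noise` sample so the errors are comparable."""
    def mse(net, x):
        # Per-sample squared denoising error ||eps_pred - eps||^2.
        return ((net(x, t, cond) - noise) ** 2).mean(dim=(1, 2, 3))

    with torch.no_grad():
        ref_w, ref_l = mse(ref_model, x_w), mse(ref_model, x_l)
    model_w, model_l = mse(model, x_w), mse(model, x_l)

    # The model is rewarded for lowering its error on the preferred image
    # (relative to the frozen reference) more than on the rejected one.
    inside = (model_w - ref_w) - (model_l - ref_l)
    return -F.logsigmoid(-beta * inside).mean()
```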

@lawrence-cj
Contributor

Thank you so much for your work and help. DPO will definitely help to get a consistent improvement. Actually, we would prefer to encourage our community members to run their own DPO fine-tuning rather than doing everything on our own~

@lawrence-cj added the good first issue (Good for newcomers) label on Apr 10, 2024
@ApolloRay

> Thank you so much for your work and help. DPO will definitely help to get a consistent improvement. Actually, we would prefer to encourage our community members to run their own DPO fine-tuning rather than doing everything on our own~

I will try this. However, will the Sigma-DMD model be released?

@lawrence-cj
Contributor

Already released! Please refer to the README. :)

@ApolloRay

> First of all, I would like to express my gratitude for your open-source PixArt-Σ project. As a developer who has been closely following your work, I couldn't wait to test the new model as soon as it was released. I used the GenEval framework to evaluate the model's text-to-image alignment, and the results show that, compared to SDXL and Playground v2.5, there is still room for improvement in this respect.
>
> | Model | Overall | Single obj. | Two obj. | Counting | Colors | Position | Color attr. |
> | --- | --- | --- | --- | --- | --- | --- | --- |
> | SD1.5 | 42.34 | 95.62 | 37.63 | 37.81 | 74.73 | 3.50 | 4.75 |
> | SD1.5-DPO | 43.00 | 96.88 | 39.90 | 38.75 | 75.53 | 3.25 | 3.75 |
> | SDXL | 55.63 | 98.12 | 75.25 | 43.75 | 89.63 | 11.25 | 15.75 |
> | Playground v2.5 | 56.37 | 97.81 | 77.02 | 51.88 | 83.78 | 11.00 | 16.75 |
> | SDXL-DPO | 58.02 | 99.38 | 82.58 | 49.06 | 85.11 | 13.50 | 18.50 |
> | PixArt-α (1024) | 47.16 | 97.81 | 46.21 | 45.00 | 77.93 | 9.00 | 7.00 |
> | PixArt-Σ (512) | 52.03 | 98.12 | 59.02 | 50.62 | 80.05 | 9.75 | 15.50 |
> | PixArt-Σ (1024) | 54.39 | 98.44 | 62.88 | 49.69 | 82.45 | 12.00 | 20.00 |
>
> I noticed that Stable Diffusion 3 adopted DPO (Direct Preference Optimization), which greatly improved its text-to-image alignment. In this regard, I would like to ask whether your team has any plans to incorporate similar optimization methods in future versions to further enhance the model's performance in this area.

I met this problem while using GenEval:
`ordering = np.argsort(bbox[index][:, 4])[::-1]` raises `TypeError: 'DetDataSample' object is not subscriptable`
Have you met this problem?
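
For what it's worth, this error usually means GenEval's evaluation code (written against the mmdetection 2.x API, where `inference_detector` returns a list of per-class `(N, 5)` arrays) is being run with mmdetection 3.x, which returns a `DetDataSample` instead. One possible workaround, sketched under that assumption (function name and details hypothetical), is to convert the new output back to the legacy layout before the `np.argsort` line:

```python
import numpy as np

def detsample_to_legacy(result, num_classes):
    """Hypothetical adapter: turn an mmdet 3.x DetDataSample into the
    mmdet 2.x-style list of per-class (N, 5) arrays [x1, y1, x2, y2, score]
    that GenEval indexes with bbox[index][:, 4]."""
    pred = result.pred_instances.cpu()   # InstanceData with bboxes / scores / labels
    boxes = pred.bboxes.numpy()          # (N, 4)
    scores = pred.scores.numpy()         # (N,)
    labels = pred.labels.numpy()         # (N,)
    stacked = np.concatenate([boxes, scores[:, None]], axis=1)  # (N, 5)
    return [stacked[labels == c] for c in range(num_classes)]
```

Alternatively, installing the mmdetection version that GenEval's setup instructions specify should avoid the mismatch entirely.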
