Just read the 'How Do LLMs and VLMs Understand Viewpoint Rotation Without Vision?' paper — wait, rephrasing: just read it, and now I'm wondering: can symbolic reasoning replace visual data?
Testing if LLMs can handle 3D spatial reasoning without visual inputs
I've been banging my head against this for 3 days trying to implement viewpoint rotation for my RAG tool. The paper shows that LLMs struggle with spatial transformations when they lack vision modules, and my own metrics match: a 40% error rate on simple rotation tasks. I need to know whether a symbolic approach can fix this.
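For what it's worth, here is the kind of symbolic approach I've been experimenting with: instead of asking the model to "imagine" the rotation, apply an explicit 2D rotation matrix to the scene coordinates and re-verbalize the result for the prompt. This is a minimal sketch, not the paper's method; the function names (`rotate_viewpoint`, `describe`) and the left/right/front/behind encoding are my own assumptions.

```python
import math

def rotate_viewpoint(points, yaw_degrees):
    """Rotate scene coordinates into a new viewer frame.

    Rotating the viewer by +yaw about the vertical (z) axis is
    equivalent to rotating every point by -yaw, so the model never
    has to do the mental rotation itself.
    """
    theta = math.radians(-yaw_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y, z in points:
        rotated.append((
            round(x * cos_t - y * sin_t, 6),  # new lateral offset
            round(x * sin_t + y * cos_t, 6),  # new depth offset
            z,                                # yaw leaves height unchanged
        ))
    return rotated

def describe(points, labels):
    """Verbalize coordinates as left/right and front/behind for the LLM prompt."""
    parts = []
    for (x, y, _), label in zip(points, labels):
        side = "left" if x < 0 else "right" if x > 0 else "centered"
        depth = "in front" if y > 0 else "behind" if y < 0 else "level"
        parts.append(f"{label}: {side}, {depth}")
    return "; ".join(parts)

# Toy scene: a cup one unit to the viewer's right, a lamp one unit ahead.
scene = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
turned = rotate_viewpoint(scene, 90)  # viewer turns 90 degrees to the left
print(describe(turned, ["cup", "lamp"]))
# the lamp ends up on the viewer's right, the cup behind them
```

The point of doing the trig outside the model is that the LLM only ever sees pre-rotated, verbalized coordinates, so the failure mode the paper identifies (symbolic rotation without a vision module) is bypassed rather than solved. Curious whether anyone has benchmarked this kind of tool-call preprocessing against prompting alone.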