I'm looking for visual document understanding, not just text extraction: the system needs to recognize that a red warning box is more important than a regular paragraph, not just pull out the words. Traditional OCR would extract 'WARNING' as plain text, but I need something that understands the visual context, since colors, positioning, boxes, and formatting all carry meaning about priority and relationships. Basically, I'm hoping for something that can answer 'why is this text important?', not just 'what does this text say?'