Based on Baddeley’s working memory model, this research proposed a method for converting textual information with network relationships into a “graphics + voice” representation and hypothesized that this dual-modal presentation would yield better comprehension performance and higher satisfaction than a purely textual display. An experiment analyzed with a simple t-test was used to test the hypothesis. The independent variable was the presentation mode: textual display vs. visual-auditory (“graphics + voice”) presentation. The dependent variables were user performance and satisfaction. Thirty subjects participated in the experiment. The results indicate that both user performance and satisfaction improved significantly with the “graphics + voice” presentation.