
Data and information visualization

Data and information visualization (data viz/vis or info viz/vis)[2] is the practice of designing and creating easy-to-communicate and easy-to-understand graphic or visual representations of a large amount[3] of complex quantitative and qualitative data and information with the help of static, dynamic or interactive visual items. Typically based on data and information collected from a certain domain of expertise, these visualizations are intended for a broader audience to help them visually explore and discover, quickly understand, interpret and gain important insights into otherwise difficult-to-identify structures, relationships, correlations, local and global patterns, trends, variations, constancy, clusters, outliers and unusual groupings within data (exploratory visualization).[4][5][6] When intended for the general public (mass communication) to convey a concise version of known, specific information in a clear and engaging manner (presentational or explanatory visualization),[4] it is typically called information graphics.

Data visualization is concerned with visually presenting sets of primarily quantitative raw data in a schematic form. The visual formats used in data visualization include tables, charts and graphs (e.g. pie charts, bar charts, line charts, area charts, cone charts, pyramid charts, donut charts, histograms, spectrograms, cohort charts, waterfall charts, funnel charts, bullet graphs, etc.), diagrams, plots (e.g. scatter plots, distribution plots, box-and-whisker plots), geospatial maps (such as proportional symbol maps, choropleth maps, isopleth maps and heat maps), figures, correlation matrices, percentage gauges, etc., which sometimes can be combined in a dashboard.


Information visualization, on the other hand, deals with multiple, large-scale and complicated datasets which contain quantitative (numerical) data as well as qualitative (non-numerical, i.e. verbal or graphical) and primarily abstract information. Its goal is to add value to raw data, improve the viewers' comprehension, reinforce their cognition and help them derive insights and make decisions as they navigate and interact with the computer-supported graphical display. Visual tools used in information visualization include maps (such as tree maps), animations, infographics, Sankey diagrams, flow charts, network diagrams, semantic networks, entity-relationship diagrams, Venn diagrams, timelines, mind maps, etc.


Emerging technologies like virtual, augmented and mixed reality have the potential to make information visualization more immersive, intuitive, interactive and easily manipulable and thus enhance the user's visual perception and cognition.[7] In data and information visualization, the goal is to graphically present and explore abstract, non-physical and non-spatial data collected from databases, information systems, file systems, documents, business and financial data, etc. (presentational and exploratory visualization) which is different from the field of scientific visualization, where the goal is to render realistic images based on physical and spatial scientific data to confirm or reject hypotheses (confirmatory visualization).[8]


Effective data visualization is properly sourced, contextualized, simple and uncluttered. The underlying data is accurate and up-to-date to make sure that insights are reliable. Graphical items are well-chosen for the given datasets and aesthetically appealing, with shapes, colors and other visual elements used deliberately in a meaningful and non-distracting manner. The visuals are accompanied by supporting texts (labels and titles). These verbal and graphical components complement each other to ensure clear, quick and memorable understanding. Effective information visualization is aware of the needs and concerns and the level of expertise of the target audience, deliberately guiding them to the intended conclusion.[9][3] Such effective visualization can be used not only for conveying specialized, complex, big data-driven ideas to a wider, non-technical audience in a visually appealing, engaging and accessible manner, but also to domain experts and executives for making decisions, monitoring performance, generating new ideas and stimulating research.[9][4] In addition, data scientists, data analysts and data mining specialists use data visualization to check the quality of data, find errors, unusual gaps and missing values in data, clean data, explore the structures and features of data and assess outputs of data-driven models.[4] In business, data and information visualization can constitute a part of data storytelling, where they are paired with a coherent narrative structure or storyline to contextualize the analyzed data and communicate the insights gained from analyzing the data clearly and memorably, with the goal of persuading the audience to make a decision or take an action in order to create business value.[3][10] This can be contrasted with the field of statistical graphics, where complex statistical data are communicated graphically in an accurate and precise manner among researchers and analysts with statistical expertise to help them perform exploratory data analysis or to convey the results of such analyses, where visual appeal, capturing attention to a certain issue and storytelling are not as important.[11]


The field of data and information visualization is of interdisciplinary nature as it incorporates principles found in the disciplines of descriptive statistics (as early as the 18th century),[12] visual communication, graphic design, cognitive science and, more recently, interactive computer graphics and human-computer interaction.[13] Since effective visualization requires design skills, statistical skills and computing skills, it is argued by authors such as Gershon and Page that it is both an art and a science.[14] The neighboring field of visual analytics marries statistical data analysis, data and information visualization and human analytical reasoning through interactive visual interfaces to help human users reach conclusions, gain actionable insights and make informed decisions which are otherwise difficult for computers to do.


Research into how people read and misread various types of visualizations is helping to determine what types and features of visualizations are most understandable and effective in conveying information.[15][16] On the other hand, unintentionally poor or intentionally misleading and deceptive visualizations (misinformative visualization) can function as powerful tools which disseminate misinformation, manipulate public perception and divert public opinion toward a certain agenda.[17] Thus data visualization literacy has become an important component of data and information literacy in the information age akin to the roles played by textual, mathematical and visual literacy in the past.[18]

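In The Visual Display of Quantitative Information, Edward Tufte states that graphical displays should:

show the data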

induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else

avoid distorting what the data has to say

present many numbers in a small space

make large data sets coherent

encourage the eye to compare different pieces of data

reveal the data at several levels of detail, from a broad overview to the fine structure

serve a reasonably clear purpose: description, exploration, tabulation, or decoration

be closely integrated with the statistical and verbal descriptions of a data set.

Data visualization involves specific terminology, some of which is derived from statistics. For example, author Stephen Few defines two types of data, which are used in combination to support a meaningful analysis or visualization:

Categorical: Represent groups of objects with a particular characteristic. Categorical variables can be either nominal or ordinal. Nominal variables, for example gender, have no order between their categories. Ordinal variables are categories with an order, for example the age group someone falls into.[50]

Quantitative: Represent measurements, such as the height of a person or the temperature of an environment. Quantitative variables can be either continuous or discrete. Continuous variables capture the idea that measurements can always be made more precisely, while discrete variables have only a finite number of possibilities, such as a count of some outcomes or an age measured in whole years.[50]


The distinction between quantitative and categorical variables is important because the two types require different methods of visualization.
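As a rough illustration of why the distinction matters, the following Python sketch classifies a list of values as nominal, ordinal, discrete or continuous and maps each type to a commonly used chart. The function names, the `ordered_levels` parameter and the chart suggestions are illustrative assumptions (rules of thumb), not definitions taken from Few.

```python
def is_number(value):
    """True for ints and floats, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def classify_variable(values, ordered_levels=None):
    """Return 'nominal', 'ordinal', 'discrete' or 'continuous'.

    ordered_levels is a hypothetical parameter: pass the category order
    to mark a categorical variable as ordinal.
    """
    if not all(is_number(v) for v in values):
        # Non-numeric labels are treated as categories.
        return "ordinal" if ordered_levels else "nominal"
    if all(isinstance(v, int) for v in values):
        # Whole-number counts or ages in whole years.
        return "discrete"
    return "continuous"


# Rule-of-thumb suggestions only; other chart types may suit the data better.
SUGGESTED_CHART = {
    "nominal": "bar chart",
    "ordinal": "bar chart with categories in their natural order",
    "discrete": "bar chart or dot plot of counts",
    "continuous": "histogram or box plot",
}


def suggest_chart(values, ordered_levels=None):
    """Look up a suggested chart for the classified variable type."""
    return SUGGESTED_CHART[classify_variable(values, ordered_levels)]


heights_cm = [162.5, 171.0, 180.2]
age_groups = ["18-29", "30-44", "45-64"]
print(suggest_chart(heights_cm))
print(suggest_chart(age_groups, ordered_levels=age_groups))
```

The type check is deliberately crude: real data often stores categories as numeric codes, so in practice the variable type usually has to be declared rather than inferred.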


Two primary types of information displays are tables and graphs.


Eppler and Lengler have developed the "Periodic Table of Visualization Methods," an interactive chart displaying various data visualization methods. It includes six types of data visualization methods: data, information, concept, strategy, metaphor and compound.[52] In "Visualization Analysis and Design", Tamara Munzner writes "Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively." Munzner argues that visualization "is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods."[53]

Cartogram

Cladogram (phylogeny)

Concept Mapping

Dendrogram (classification)

Information visualization reference model

Grand tour

Graph drawing

Heatmap

HyperbolicTree

Multidimensional scaling

Parallel coordinates

Problem solving environment

Treemapping

Interactive data visualization enables direct actions on a graphical plot to change elements and link between multiple plots.[56]

Interactive data visualization has been a pursuit of statisticians since the late 1960s. Examples of the developments can be found on the American Statistical Association video lending library.[57]

Common interactions include:

Brushing: works by using the mouse to control a paintbrush, directly changing the color or glyph of elements of a plot. The paintbrush is sometimes a pointer and sometimes works by drawing an outline of sorts around points; the outline is sometimes irregularly shaped, like a lasso. Brushing is most commonly used when multiple plots are visible and some linking mechanism exists between the plots. There are several different conceptual models for brushing and a number of common linking mechanisms. Brushing scatterplots can be a transient operation, in which points in the active plot retain their new characteristics only while they are enclosed or intersected by the brush, or it can be a persistent operation, so that points retain their new appearance after the brush has been moved away. Transient brushing is usually chosen for linked brushing.

Painting: Persistent brushing is useful when we want to group the points into clusters and then proceed to use other operations, such as the tour, to compare the groups. It is becoming common terminology to call the persistent operation painting.

Identification: which could also be called labeling or label brushing, is another plot manipulation that can be linked. Bringing the cursor near a point or edge in a scatterplot, or a bar in a barchart, causes a label to appear that identifies the plot element. It is widely available in many interactive graphics, and is sometimes called mouseover.

Scaling: maps the data onto the window, and changes in the area of the mapping function help us learn different things from the same plot. Scaling is commonly used to zoom in on crowded regions of a scatterplot, and it can also be used to change the aspect ratio of a plot, to reveal different features of the data.

Linking: connects elements selected in one plot with elements in another plot. The simplest kind of linking is one-to-one, where both plots show different projections of the same data, and a point in one plot corresponds to exactly one point in the other. When using area plots, brushing any part of an area has the same effect as brushing it all and is equivalent to selecting all cases in the corresponding category. Even when some plot elements represent more than one case, the underlying linking rule still links one case in one plot to the same case in other plots. Linking can also be by categorical variable, such as by a subject id, so that all data values corresponding to that subject are highlighted, in all the visible plots.
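The difference between transient brushing, persistent brushing (painting) and one-to-one linking can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not the behavior of any particular tool: the `Brush` and `LinkedPlots` classes are hypothetical, the brush is a plain rectangle in data coordinates, and every plot is assumed to hold exactly one point per case, in the same order.

```python
from dataclasses import dataclass, field


@dataclass
class Brush:
    """Axis-aligned rectangular brush in data coordinates (an assumption)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, point):
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


@dataclass
class LinkedPlots:
    """Plots of the same cases, linked one-to-one by case index."""
    plots: dict                # plot name -> list of (x, y), one per case
    persistent: bool = False   # False: transient brushing, True: painting
    highlighted: set = field(default_factory=set)

    def brush(self, plot_name, brush):
        """Apply a brush in one plot and update the shared highlight set."""
        hits = {i for i, p in enumerate(self.plots[plot_name])
                if brush.contains(p)}
        if self.persistent:
            # Painting: cases keep their new appearance after the brush moves.
            self.highlighted |= hits
        else:
            # Transient: only cases currently under the brush are highlighted.
            self.highlighted = hits
        return self.highlighted

    def highlighted_points(self, plot_name):
        """Linking: show the highlighted cases as they appear in another plot."""
        return [self.plots[plot_name][i] for i in sorted(self.highlighted)]


plots = {
    "height_vs_weight": [(160, 55), (175, 70), (190, 95)],
    "age_vs_income": [(25, 30), (40, 52), (58, 61)],
}
linked = LinkedPlots(plots)
linked.brush("height_vs_weight", Brush(170, 195, 60, 100))
print(linked.highlighted_points("age_vs_income"))
```

Because the highlight set stores case indices rather than plot coordinates, any plot of the same cases can render the selection, which is the essence of one-to-one linking.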

There are different approaches to the scope of data visualization. One common focus is on information presentation, as in Friedman (2008). Friendly (2008) presumes two main parts of data visualization: statistical graphics and thematic cartography.[58] Along these lines, the article "Data Visualization: Modern Approaches" (2007) gives an overview of seven subjects of data visualization:[59]

Articles & resources

Displaying connections

Displaying data

Displaying news

Displaying websites

Mind maps

Tools and services


All these subjects are closely related to graphic design and information representation.


On the other hand, from a computer science perspective, Frits H. Post in 2002 categorized the field into sub-fields:[26][60]


Writing in the Harvard Business Review, Scott Berinato developed a framework to approach data visualisation.[61] To start thinking visually, users must consider two questions: (1) what you have and (2) what you are doing. The first step is identifying what you want visualised: either something data-driven, like profit over the past ten years, or a conceptual idea, like how a specific organisation is structured. Once this question is answered, one can then focus on whether one is trying to communicate information (declarative visualisation) or trying to figure something out (exploratory visualisation). Berinato combines these questions to give four types of visual communication that each have their own goals.[61]


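These four types of visual communication are idea illustration (conceptual and declarative), idea generation (conceptual and exploratory), visual discovery (data-driven and exploratory) and everyday dataviz (data-driven and declarative).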
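The two questions can be read as the axes of a two-by-two grid. The Python sketch below encodes that grid as a lookup table. The type labels are those commonly attributed to Berinato's framework; the names `VISUAL_TYPES` and `visual_communication_type` are hypothetical and used only for illustration.

```python
# Keys are (what you have, what you are doing); values are the type labels
# commonly attributed to Berinato's framework.
VISUAL_TYPES = {
    ("conceptual", "declarative"): "idea illustration",
    ("conceptual", "exploratory"): "idea generation",
    ("data-driven", "exploratory"): "visual discovery",
    ("data-driven", "declarative"): "everyday dataviz",
}


def visual_communication_type(what_you_have, what_you_are_doing):
    """Return the type of visual communication for the two answers."""
    key = (what_you_have, what_you_are_doing)
    if key not in VISUAL_TYPES:
        raise ValueError(f"unknown combination: {key!r}")
    return VISUAL_TYPES[key]


# Profit over the past ten years, shown to communicate a known result.
print(visual_communication_type("data-driven", "declarative"))
```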

Data and information visualization insights are being applied in areas such as:[19]

Scientific research

Digital libraries

Data mining

Information graphics

Financial data analysis

Health care[62]

Market studies

Manufacturing production control

Crime mapping

eGovernance and Policy Modeling

Digital Humanities

Data Art

Notable academic and industry laboratories in the field are:

Adobe Research

IBM Research

Google Research

Microsoft Research

Panopticon Software

Scientific Computing and Imaging Institute

Tableau Software

University of Maryland Human-Computer Interaction Lab


Conferences in this field, ranked by significance in data visualization research,[63] are:


For further examples, see: Category:Computer graphics organizations

To use data to provide knowledge in the most efficient manner possible (minimize noise, complexity, and unnecessary data or detail given each audience's needs and roles)

To use data to provide knowledge in the most effective manner possible (provide relevant, timely and complete data to each audience member in a clear and understandable manner that conveys important meaning, is actionable and can affect understanding, behavior and decisions)

There are numerous tools available for data visualization, each with its own strengths and applications. Some of the most widely used tools include:

Tableau: A powerful and flexible tool that allows users to create a wide variety of interactive and shareable dashboards.

Power BI: A business analytics service by Microsoft that provides interactive visualizations and business intelligence capabilities with an interface simple enough for end users to create their own reports and dashboards.

Enqdb(): A versatile data visualization tool that provides customizable dashboards and advanced data analytics to help businesses gain insights from their data (http://www.enqdb.com).

D3.js: A JavaScript library for producing dynamic, interactive data visualizations in web browsers using SVG, HTML, and CSS.

Google Data Studio: A free tool that turns data into informative, easy-to-read, easy-to-share, and fully customizable dashboards and reports.

QlikView: A business discovery platform that provides self-service BI for all business users in organizations.

Excel: While traditionally known as a spreadsheet application, Excel offers robust data visualization capabilities, including charts, graphs, and pivot tables.

R and ggplot2: The R programming language and its ggplot2 package are widely used for statistical analysis and visualization.

Python and Matplotlib/Seaborn: Python, with its Matplotlib and Seaborn libraries, is commonly used for creating static, animated, and interactive visualizations.

Plotly: An open-source graphing library that makes interactive, publication-quality graphs online.



These tools vary in their complexity, cost, and the level of customization they offer, catering to different needs from simple charting to complex interactive visualizations.

Healy, Kieran (2019). Data Visualization: A Practical Introduction. Princeton: Princeton University Press. ISBN 978-0-691-18161-5.

Wilke, Claus O. (2018). Fundamentals of Data Visualization. O'Reilly. ISBN 978-1-4920-3108-6.

Evergreen, Stephanie (2016). Effective Data Visualization: The Right Chart for the Right Data. Sage. ISBN 978-1-5063-0305-5.

Tufte, Edward R. (2015). The Visual Display of Quantitative Information (2 ed.). Graphics Press. ISBN 9780961392147.

Kawa Nazemi (2014). Adaptive Semantics Visualization. Eurographics Association.

Few, Stephen (2012). Show Me the Numbers: Designing Tables and Graphs to Enlighten (2 ed.). Analytics Press. ISBN 9780970601971. OCLC 795009632.

Wilkinson, Leland (2012). Grammar of Graphics. New York: Springer. ISBN 978-1-4419-2033-1.

Mazza, Riccardo (2009). Introduction to Information Visualization. Springer. ISBN 9781848002180. OCLC 458726890.

Andreas Kerren, John T. Stasko, Jean-Daniel Fekete, and Chris North (2008). Information Visualization – Human-Centered Issues and Perspectives. Volume 4950 of LNCS State-of-the-Art Survey, Springer.

Spence, Robert (2007). Information Visualization: Design for Interaction (2nd Edition). Prentice Hall. ISBN 0-13-206550-9.

Jeffrey Heer, Stuart K. Card, James Landay (2005). "Prefuse: a toolkit for interactive information visualization". In: ACM Human Factors in Computing Systems CHI 2005.

Post, Frits H.; Nielson, Gregory M.; Bonneau, Georges-Pierre (2003). Data Visualization: The State of the Art. New York: Springer. ISBN 978-1-4613-5430-7.

Colin Ware (2000). Information Visualization: Perception for Design. San Francisco, CA: Morgan Kaufmann.

Cleveland, William S. (1993). Visualizing Data. Hobart Press. ISBN 0-9634884-0-6.

Milestones in the History of Thematic Cartography, Statistical Graphics, and Data Visualization: An illustrated chronology of innovations by Michael Friendly and Daniel J. Denis.

Duke University-Christa Kelleher Presentation-Communicating through infographics-visualizing scientific & engineering information-March 6, 2015