Business Analytics Track

Writing to excel from SAS, DDE vs ODS

Robert Ellsworth

This paper will demonstrate 3 methods for populating excel report. the paper will compare methods using DDE versus writing CSV and using excel macros versus ODS. The pros and cons of each method will be highlighted.

View paper.

A Quick Look at Fuzzy Matching Programming Techniques Using SAS® Software

Kirk Paul Lafler and Stephen Sloan

Data comes in all forms, shapes, sizes and complexities. Stored in files and data sets, SAS® users across industries know all too well that data can be, and often is, problematic and plagued with a variety of issues. When unique and reliable identifiers, referred to as the key, are available, users routinely are able to match records from two or more data sets using merge, join, and/or hash programming techniques without problem. But, when a unique and reliable identifier is not available, or does not exist, then one or more fuzzy matching programming techniques must be used. Topics include introducing what fuzzy matching is along with examples of the SOUNDEX (for phonetic matching) algorithm, and the SPEDIS, COMPLEV, and COMPGED functions to resolve key identifier issues and to successfully merge, join and match less than perfect or messy data.

View paper.

Python applications in SAS® programming for medical device clinical trial studies

Jianlin Li and Sherry Cao

As Statistical Analysts supporting global medical device regulatory submissions, we routinely deal with requests to generate SAS Table, Figure, and Listing (TFL) in English and the local language, e.g. Chinese or Japanese. We also need to prepare batch files to execute SAS codes to increase programming efficiency. File folder cloning and SAS source code searching and replacement in bulk are also necessary steps for many reporting activities. Text processing is common to many of daily tasks like what are mentioned above.

As a modern general purpose and high-level programming language, python is very powerful, especially in manipulating text data. In this paper, we present a few python applications to improve efficiency and accuracy in SAS programming work: (1) Generic Unicode conversion and handling in SAS programs to support non-English languages; (2) SAS dataset program and SAS TFL program execution batch file generation; (3) Structured folder generation.

SAS macros can be created to generate input parameters and configuration files for Python programs. This enables SAS programmers, without knowing Python languages, to finish tasks by simply calling SAS macros. With its simple syntax rules and readability, Python has more potential applications that could make SAS programmers’ lives easier.

View paper.

I Don't Use Python or R - Why Should I Use Jupyter Notebooks?

Jack Hamilton

Jupyter Notebooks is a web-based tool for editing and running programs in various languages, and for displaying and distributing the results.

Most Jupyter "advertising" to SAS users has focused on how it works with R and Python. But what if you don't know/care about those languages? Can Jupyter Notebooks still be useful?

This presentation starts with an all-SAS demonstration of Jupyter Notebooks, and shows how they can be useful to old-school programmers. A few of the ways: Iterative development, Integrated output, and Stylish documentation.

View paper.

Connecting to Datasets through Python and SAS

Joe Matise

Python is a powerful tool for working with data, and makes the perfect partner to SAS to process data, particularly data that is obtained from online sources where programmers have developed open source Python code to connect to published APIs.

SAS has developed two tools to make working with Python easier for SAS programmers, without needing to know nearly any Python - and also made working with SAS easier for Python programmers who do not know SAS! In this paper, we explore these tools, as well as how to use SAS with Jupyter Notebooks, and show how to download some data from a few data sources.

This paper is aimed at an intermediate level programmer, but does not require particular SAS or Python knowledge and should be comprehensible to anyone. A basic understanding of using APIs to access data can be helpful but is not required.

View paper.

The Write Stuff--Data Driven Dynamic SAS® Coding From Control Tables and Other Datasets

Frank Ferriola

Have you ever written code and found yourself constantly updating it to take care of all the different possibilities? Does it get to a point where it becomes cumbersome, or causes you to hardcode when it needs to be dynamically coded? Consider using data to drive your coding. Set up control tables or use other data to drive your coding and circumstances. This paper will walk you through several ways to build your code for one-time use, or save it out to reuse as needed.

View paper.

Git'r'done: Using Git in the production SAS environment

Joe Matise

SAS Professionals often need to use version control, such as Git, to manage their programs through the various stages of development and production release. In this talk, we will discuss the different options for using Git with SAS programs, including different ideas of the best way to use Git with different SAS clients and their respective file formats, and whether using the integrated Git functionality in different SAS clients is a good idea or not.

This should appeal to SAS developers at all levels who use version control in their daily work, and should not require any particular SAS knowledge.

View paper.

SAS to Excel: ODS Tagsets to Hundred Files or Hundred Tabs

Abdus Shahid

This paper will present methods of outputting dozens or hundreds of tables to an excel workbook as tabs or to hundreds excel files with the use of ODS Tagsets that comes with a table of contents. All the formatting for the tables are done with SAS codes, so that little or no manual formatting is needed once the final output is opened in excel. The method described is especially useful when a large number of tables are needed to be complied in one excel workbook.
Some knowledge of Base SAS coding and XML output is required to fully utilize this process.

View paper.

Spec to Code in Seconds: Use Microsoft Excel for Repetitive Code Writing

Pinky Anandani Dutta

The role of a SAS programmer in the Pharmaceutical industry has become very versatile over the years. SAS programmers are no longer just coding using SAS ® but also performing many other functions including writing programming specifications, annotating Case Report Forms (CRF) and mocks shells, preparing documents for a New Drug Application (NDA), etc. As such, knowledge of robotic features in other tools and editors can come in very handy. One such tool is Microsoft Excel, which is relatively easy to learn and widely used already. This paper will discuss how programming specifications that require repetitive code can be transformed into SAS code in seconds by taking advantage of different functions in Excel.

View paper.

Using SAS to enhance data sharing across REDCap projects: Reducing errors, streamlining management, and improving quality

Rachel Myers

With the increasing use of web-based data collection tools and systems, such as the widely used REDCap (Research Electronic Data Capture), there are opportunities to enhance our approach to data management, particularly related to data sharing across databases, projects, or teams. In situations requiring data sharing across studies or the need to restrict access to data to maintain blinding to treatment assignment, study teams often establish multiple project databases with different user permissions and selected data elements. For on-going data sharing, teams commonly use the manual export and import features of REDCap, which can be burdensome, time intensive, require careful attention to file formatting, and are error prone. Enabling REDCap API (Application Programming Interface) allows users to utilize SAS to securely share restricted or selected data across databases, eliminating the need for repeatedly conducting manual data exports, data management, and import of data files. In this paper, we will describe SAS code that permits the direct export of data from a REDCap database into SAS for data management and manipulation and subsequent import of selected data elements into a target REDCap database. This approach establishes a secure, replicable approach to data sharing that is particularly useful for prospective research and for projects requiring routine and on-going sharing of restricted data. Additionally, the use of the API parameters provides protection against overwriting existing data, reduces the risk of inadvertent disclosure of sensitive data, and provides a mechanism for maintaining records of all data sharing.

View paper.

Automating ADSL Programming Using Pinnacle 21 Specifications

Tracy Sherman and Aakar Shah

As the use of Pinnacle 21 ADaM define specifications increases, so does the opportunity for automation. ADaM programming can be automated by using the Pinnacle 21 specifications to determine the necessary variable data source, assignments, codelists and derivations. As each variable has a specific origin type designated in the specifications such as predecessor, assigned, derived, etc., the specifications can be tailored and drawn upon to write the applicable SAS code to a program.

This paper will demonstrate how Pinnacle 21 specifications for ADSL can be used to generate a program for producing an ADSL data set. Specification guidelines based on origin type will be proposed to help streamline the specification writing process and strengthen the ability to automate data set programming.

View paper.

From Readability to Responsible Risk Management: Facilitating the Automatic Identification and Aggregation of Software Technical Debt within an Organization Through Standardized Commenting in SAS® Program Files and SAS Enterprise Guide Project Files

Troy Martin Hughes

Software readability is greatly improved when programs include descriptive comments in a predictable, standardized format. Program headers that describe software requirements, author, creation date, versioning history, caveats, and other metadata are a common method to facilitate a greater understanding of software objectives, strengths, weaknesses, and prerequisites. Moreover, when program headers are standardized, they are not only more readable to developers but also to parsing algorithms that can automatically extract metadata for analysis or archival. Comments throughout software can also improve its readability and, when constructed in a standardized format, can be parsed automatically and saved in control tables. This text introduces a standardized commenting methodology that enables both qualitative and quantitative comments to be parsed from SAS® software headers and body. A configuration file defines comment formatting and content and provides a flexible, scalable, reusable, data-driven SAS macro-based solution. This text demonstrates one use case for this methodology in which software technical debt and risk are assessed via both qualitative (e.g., risk description, proposed risk resolution) and quantitative (e.g., risk severity, risk probability, likelihood of risk discovery, ease of risk mitigation) metadata and metrics included within SAS comments. The comment interpreter dynamically identifies and parses all SAS program files and SAS Enterprise Guide project files (including imbedded SAS programs therein) within one or more folders to produce a comprehensive, quantifiable risk register. This data-driven documentation, generated with push-button simplicity, enables SAS practitioners to better understand and make decisions about technical debt and risk, including at the program, project, developer, team, and organizational levels.

View paper.

Pipe it! Extract it! Build it!

Smitha Krishnamurthy and Sofia Shamas

A process improvement tool developed using FILENAME statement with PIPE option in SAS to create study related validation tracker with key information captured from the individual SAS program headers within a folder. The validation tracker is an excel document with hyperlinks to the actual RTF output and serves as a guideline for the developer and the validator during validation. Additionally, this macro creates a validation issue document in excel where we verify the accuracy of the program header by comparing the details of the RTF outputs listed against the actual outputs created. This macro does require the user to follow a standard header structure across all codes and execute it as a batch process. The use of pipe option limits the code use with SAS Enterprise Guide. Macro was run in SAS 9.4.

View paper.

Python and R made easy for the SAS Programmer

Janet Li and Varaprasad Ilapogu

Many of the day-to-day tasks and responsibilities of the statistical programmer of a pharmaceutical research and development group or contract research organization (CRO) include importing/exporting data, deriving variables and creating analysis data sets, and creating clinical study report (CSR) materials such as tables, listings, and figures (TLFs). These outputs are used for submission to regulatory agencies and are generally programmed in SAS. There is a growing movement and acceptance towards using Python and R in the Clinical SAS programming world. In our paper, we explore tips and tricks on how to use Python and R in conjunction with SAS to more effectively and efficiently deliver statistical programming outputs. We intend to provide examples of common SAS procedures and syntax that are used in the creation of analysis datasets and TLFs and translate them to Python and R to help the beginner Python and/or R programmer familiarize themselves with the alternative way of programming these statistical outputs.

View paper.

Using Jupyter to Take Your Data Science Workflow to the Next Level

Hunter Glanz

From state-of-the-art research to routine analytics, the Jupyter Notebook offers an unprecedented reporting medium. Historically, tables, graphics, and other types of output had to be created separately and then integrated into a report piece by piece, amidst the drafting of text. The Jupyter Notebook interface enables you to create code cells and markdown cells in any arrangement. Markdown cells allow all typical formatting. Code cells can run code in the document. As a result, report creation happens naturally and in a completely reproducible way. Handing a colleague a Jupyter Notebook file to be re-run or revised is much easier and simpler for them than passing along, at a minimum, two files:one for the code and one for the text. Traditional reports become dynamic documents that include both text and living SAS® code that is run during document creation. With the SAS kernel for Jupyter, you have the power to create these computational narratives and much more!

View paper.