Datasets Models

A deep-dive into data tables available for analysis via Destinations

Table & Column Referential

An exhaustive list of all tables and their respective columns, labeled by schema, can be found in this Google Sheet:

https://docs.google.com/spreadsheets/d/1gYGqKQP9Z71nz1lOb4y0wJ15i27NvsUzdliIJeIx0HY/edit#gid=175439716

Table Descriptions

Product

PRODUCT_BADGES - This table contains all the different badges' names and types we observed on your products on any given day since they were tracked.

PRODUCT_DETAILED - This table contains historical data of the daily product listings for all the products you track on our platform.

PRODUCT_EMS - EMS stands for Estimated Monthly Sales. This table contains our estimates for each product's daily/monthly sales, as well as a score and trust index that represent how reliable the estimates are expected to be.

PRODUCT_IMAGE_LINKS - This table contains the URLs of all images seen in each product snapshot. Each row contains two URLs: one for the large version of the image and one for the thumbnail.

PRODUCT_KEYWORD_RESEARCH - This table looks at all DataHawk data from the past 7 days and counts the instances where your products were spotted across all keywords in the DataHawk database. This allows you to quickly identify relevant keywords that are worth tracking.

PRODUCT_LISTING_QUALITY_SCORE - This table shows your tracked products' daily Listing Quality Score. This proprietary KPI scans your product listings to give you an elaborate score.

PRODUCT_LISTING_QUALITY_SCORE_BREAKDOWN - This table shows the breakdown of each product's listing quality score per component of the score.

PRODUCT_SALES_RANK - This table contains an exhaustive record of the various browse nodes and sales ranks detected for your products on a daily basis ever since they were tracked.

PRODUCT_VARIATION_LATEST - This table shows the latest version of the variations of your tracked products.

SEO

SEO_KEYWORD_VOLUME - A proprietary KPI that estimates the monthly search volume of the keywords you are tracking.

SEO_PRODUCT_RANK_ORGANIC - Daily organic ranking (pages 1 to 5) of your tracked keywords and all visible data on the keyword page.

SEO_PRODUCT_RANK_SPONSORED - Daily sponsored ranking of your tracked keywords, along with all visible data on the keyword page.

Finance

FINANCE_ACCOUNT_METRICS - This table shows the daily aggregate of all successful orders for a given account. It's a high-level aggregated view for following what's happening at the account level.

FINANCE_COGS - This table shows the Cost of Goods Sold of products on specific days. This is based on a spreadsheet that must be manually uploaded to DataHawk's in-app Profit & Loss dashboard.

FINANCE_EVENTS - This table lists a daily breakdown of financial events, categorized by account key and marketplace. Some financial events may include order id and ASIN information. The metric column indicates the amount associated with each Amazon Financial Event. It is a comprehensive table designed to provide detailed insights, even into reimbursed gift wrap events.

FINANCE_ORDERS - This table shows a daily record of all the orders (successful & canceled) that occurred for a specific account. Unlike the FINANCE_ACCOUNT_METRICS table, this data is not aggregated and is presented at an order level. The table features informative columns such as PURCHASE_TIME, displaying the precise time at which each order was processed.

FINANCE_ORDER_SUCCESS - This table shows a daily record of successful orders that took place for a specific account. It is essentially the same as the FINANCE_ORDERS table, with the key difference being that it is filtered to display only the successful orders, as denoted by a status of "Shipped."

FINANCE_PRODUCT_METRICS_DAILY - This table aggregates orders on a product level daily. It's built to follow the performance of a product every day: number of orders, number of units sold, and sales.

FINANCE_PRODUCT_METRICS_MONTHLY - This table aggregates orders on a product level monthly. It's built to follow the performance of a product every month: number of orders, number of units sold, and sales.

FINANCE_PROFIT_LOSS - This table aggregates all profit & loss metrics and their amount per account key and day.
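
The relationship between the order-level and aggregated Finance tables can be sketched in Python. This is a minimal illustration under stated assumptions, not DataHawk's actual pipeline: the column names ORDER_ID, ASIN, STATUS, UNITS, and SALES and all sample rows are hypothetical, and whether the aggregated product tables exclude canceled orders is also an assumption. Only PURCHASE_TIME and the "Shipped" status come from the descriptions above; check the Table & Column Referential sheet for the real column names.

```python
from collections import defaultdict

# Hypothetical rows shaped like FINANCE_ORDERS (one row per order).
# All column names except PURCHASE_TIME are assumptions.
finance_orders = [
    {"ORDER_ID": "A-1", "ASIN": "B0EXAMPLE1", "STATUS": "Shipped",
     "PURCHASE_TIME": "2023-04-01T09:15:00", "UNITS": 2, "SALES": 40.0},
    {"ORDER_ID": "A-2", "ASIN": "B0EXAMPLE1", "STATUS": "Canceled",
     "PURCHASE_TIME": "2023-04-01T10:02:00", "UNITS": 1, "SALES": 20.0},
    {"ORDER_ID": "A-3", "ASIN": "B0EXAMPLE1", "STATUS": "Shipped",
     "PURCHASE_TIME": "2023-04-01T18:40:00", "UNITS": 1, "SALES": 20.0},
    {"ORDER_ID": "A-4", "ASIN": "B0EXAMPLE1", "STATUS": "Shipped",
     "PURCHASE_TIME": "2023-04-15T12:00:00", "UNITS": 3, "SALES": 60.0},
]


def successful_orders(orders):
    """Mimic FINANCE_ORDER_SUCCESS: keep only orders with status "Shipped"."""
    return [o for o in orders if o["STATUS"] == "Shipped"]


def product_metrics(orders, period_chars):
    """Aggregate per (ASIN, period); 10 chars gives days, 7 gives months."""
    totals = defaultdict(lambda: {"orders": 0, "units": 0, "sales": 0.0})
    for o in orders:
        key = (o["ASIN"], o["PURCHASE_TIME"][:period_chars])
        totals[key]["orders"] += 1
        totals[key]["units"] += o["UNITS"]
        totals[key]["sales"] += o["SALES"]
    return dict(totals)


shipped = successful_orders(finance_orders)
daily = product_metrics(shipped, 10)    # in the spirit of FINANCE_PRODUCT_METRICS_DAILY
monthly = product_metrics(shipped, 7)   # in the spirit of FINANCE_PRODUCT_METRICS_MONTHLY
print(daily[("B0EXAMPLE1", "2023-04-01")])
print(monthly[("B0EXAMPLE1", "2023-04")])
```

The same filter-then-group pattern carries over to SQL run in a Destination, once the hypothetical names are replaced with the real ones.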

Advertising

ADVERTISING_ACCOUNT_METRICS - Advertising data aggregated by account and date and sponsoring type.

ADVERTISING_CAMPAIGN_METRICS - Advertising data aggregated by account, campaign, date, and sponsoring type.

ADVERTISING_PRODUCT_CAMPAIGN_METRICS - Advertising data aggregated by account, campaign, product, date, and sponsoring type.

ADVERTISING_PRODUCT_METRICS - Advertising data aggregated by account, date, product, and sponsoring type.

ADVERTISING_KEYWORD_METRICS - Advertising data aggregated by account, keyword, date, and sponsoring type.

Market

MARKET_BEST_SELLER_RANK - This table shows products' sales ranks within specific tracked categories. Unlike PRODUCT_SALES_RANK, this table shows the ranks of products within your tracked categories, and not within your tracked products.

MARKET_BROWSE_NODE_SALES_ESTIMATES - This table shows our estimates of total monthly units sold per tracked category.

MARKET_PRODUCT_SALES_ESTIMATES - This table shows products' sales estimates within specific tracked categories. Unlike PRODUCT_EMS, this table shows the product sales within your tracked categories, and not within your tracked products.

Reports

REPORT_MARKET_PRODUCT_SALES_RANK_AND_ESTIMATES - A consolidated view of both the Sales Ranks and Sales Estimates of products in tracked Categories. This essentially joins MARKET_BEST_SELLER_RANK and MARKET_PRODUCT_SALES_ESTIMATES. This table contains the brand and brand_catalog columns, which refer to each product's brand obtained from two different sources (Amazon Website and API, respectively).

REPORT_PRODUCT_LATEST_VERSION - This table contains key data (Name, Brand, Price, etc.) on all products that appear across all schemas. This means it includes not only tracked products, but also products from SEO, Finance, Advertising, and other data domains. This table contains the brand and brand_catalog columns, which refer to each product's brand obtained from two different sources (Amazon Website and API, respectively).

REPORT_PRODUCT_SEARCH_RESULTS_BADGES_HISTORY - This table contains historical data on product badges seen in the search results pages for specific tracked keywords, such as the Amazon's Choice badge.

REPORT_AMAZONS_CHOICE_BADGE_HISTORY - This table contains a complete historical record of all instances where the Amazon's Choice badge was detected on a product. The source column specifies whether these badges were observed on a keyword's search results page or the product's own page.

REPORT_SEO_PRODUCT_ORGANIC_RANK - This table is similar to SEO_PRODUCT_RANK_ORGANIC but with additional product data such as the product's name, brand, and price. This table also includes the brand and brand_catalog columns, which denote the brand of each product obtained from two different sources (Amazon Website and API, respectively).

REPORT_SEO_PRODUCT_SPONSORED_RANK - This table is similar to SEO_PRODUCT_RANK_SPONSORED but with additional product data such as the product's name, brand, and price. This table also includes the brand and brand_catalog columns, which denote the brand of each product obtained from two different sources (Amazon Website and API, respectively).

Raw Inventory

RAW_INVENTORY_FBA - This table shows the most recent snapshot of each product's inventory status and breakdown across Amazon's fulfillment network.

RAW_INVENTORY_FBA_HISTORY - This table shows historical snapshots of each product's inventory status and breakdown across Amazon's fulfillment network.

RAW_INVENTORY_LEDGER_SUMMARY - This table shows daily aggregates of inventory movement/events per product.

RAW_INVENTORY_LEDGER_DETAIL - This table shows all historical events pertaining to inventory movement per product.

RAW_INVENTORY_RESTOCK_RECOMMENDATIONS - This table shows the most recent snapshot of restock recommendations generated by Amazon per product.

RAW_INVENTORY_RESTOCK_RECOMMENDATIONS - This table shows historical snapshots of restock recommendations generated by Amazon per product.

Referential

REFERENTIAL_ADVERTISING_ACCOUNT - This table lists all the Advertising accounts you connected to DataHawk and general information about these accounts. Connecting these accounts will populate everything under ADVERTISING.

REFERENTIAL_BROWSE_NODE_TRACKED - List of browse nodes (categories) you are currently tracking. A browse node is defined by a name (browse_node_name) and an id (browse_node_id).

REFERENTIAL_BROWSE_NODE_TREE - This table is our referential of all browse nodes on Amazon and Walmart and their hierarchies/paths. You can find information such as the path, the node level, the name, or the last observation date.

REFERENTIAL_CURRENCY_RATE - This table is a matrix of dates and currency rates. For every date since January 1, 2016, it provides a currency, a base, and a rate; the base is always USD.

REFERENTIAL_FINANCE_ACCOUNT - This table lists all the Seller Central (Finance) accounts you connected to DataHawk. Connecting these accounts will populate everything under FINANCE.

REFERENTIAL_KEYWORD_TAG - This table lists all the keywords tracked and the corresponding tags you've put on DataHawk.

REFERENTIAL_KEYWORD_TRACKED - This table lists all the keywords you tracked on DataHawk and information about when you started tracking them.

REFERENTIAL_MARKETPLACE - This is the list of all covered marketplaces and relative information about them.

REFERENTIAL_PRODUCT_TAG - This table lists all the products tracked and the corresponding tags you have assigned on DataHawk.

REFERENTIAL_PRODUCT_TRACKED - This table lists all the products you are currently tracking on DataHawk. If you want to enable the PRODUCT tables, you need to have tracked products on the platform beforehand.

REFERENTIAL_PROJECT - All the tags you've created on DataHawk and the keywords and products within each of them.
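
As an illustration of how REFERENTIAL_CURRENCY_RATE might be used, the sketch below converts an amount into USD. It assumes that the rate expresses how many units of the currency equal 1 USD (the base); the description above does not state the direction, so verify it against real rows before relying on it. The rates shown are made-up round numbers, not actual exchange rates, and the function name to_usd is hypothetical.

```python
# Hypothetical rows keyed by (date, currency); the base is always USD.
# ASSUMPTION: rate = units of the currency per 1 USD. Values are invented.
currency_rates = {
    ("2023-04-01", "CAD"): 1.25,
    ("2023-04-01", "GBP"): 0.75,
    ("2023-04-01", "USD"): 1.0,
}


def to_usd(amount, currency, date, rates=currency_rates):
    """Convert an amount in the given currency to USD using that date's rate."""
    try:
        rate = rates[(date, currency)]
    except KeyError:
        raise ValueError(f"no rate for {currency} on {date}") from None
    return amount / rate


print(to_usd(50.0, "CAD", "2023-04-01"))
print(to_usd(30.0, "GBP", "2023-04-01"))
```

If real rows show the opposite direction (USD per unit of currency), the division becomes a multiplication.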

Selling Partner

BA_MARKET_BASKET_SRC - Data on the items that are most commonly purchased in combination with the items in the customer's basket (cart) at checkout.

BA_REPEAT_PURCHASE_SRC - Data on the quantity of repeated purchases of the selling partner's items.

BA_SEARCH_TERMS_SRC - Data on the top clicked ASINs by search keyword and department for a marketplace.

FBA_CUSTOMER_RETURNS_SRC - Data on customer returned items received at an Amazon fulfillment center, including Return Reason and Disposition.

FBA_SALES_FULFILLED_SHIPMENTS_SRC - Data on order, shipment, and item information, including price, courier, and tracking data.

FBA_SNS_FORECAST_SRC - Eight-week forecast of sales based on existing subscriptions and sales history.

FBA_SNS_PERFORMANCE_SRC - Trailing four-week view of items shipped, percentage of orders missed due to being out of stock, average post-discount sale price, and average discount for each SKU.

SELLER_RETAIL_TRAFFIC_BY_ASIN_SRC - In-depth daily breakdown of page views and sessions data per product.

SELLER_RETAIL_TRAFFIC_BY_SKU_SRC - In-depth daily breakdown of page views and sessions data per product SKU.

SELLER_RETAIL_TRAFFIC_BY_PARENT_SRC - In-depth daily breakdown of page views and sessions data per parent ASIN.

VENDOR_SALES_MANUFACTURING_SRC - Sales data including Vendor's Revenue (Shipped COGS), Amazon's Revenue (Sales), Units Sold, and Returns. Uses the Manufacturing view, which includes all products manufactured by the vendor.

VENDOR_SALES_SOURCING_SRC - Sales data including Vendor's Revenue (Shipped COGS), Amazon's Revenue (Sales), Units Sold, and Returns. Uses the Sourcing view, which only includes products sourced from the vendor's warehouse.

VENDOR_INVENTORY_MANUFACTURING_SRC - In-Depth Daily Inventory Breakdown. Uses the Manufacturing view, which includes all products manufactured by the vendor.

VENDOR_INVENTORY_SOURCING_SRC - In-Depth Daily Inventory Breakdown. Uses the Sourcing view, which only includes products sourced from the vendor's warehouse.

VENDOR_FORECASTING_SRC - Demand Forecasts per ASIN provided on a weekly basis. Includes Mean, P70, P80, and P90 forecasts. A P70 forecast means that if the Vendor stocks up on the recommended amount, there is a 70% chance of avoiding a shortage.

VENDOR_NET_PURE_PRODUCT_MARGIN_SRC - Net Pure Margins per Product, indicating profitability from Amazon's perspective.

VENDOR_TRAFFIC_SRC - Daily Page Views per Product.
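
The percentile forecasts in VENDOR_FORECASTING_SRC can be read as coverage levels. The sketch below only illustrates that interpretation and is not Amazon's forecasting method: the demand scenarios and stock levels are invented, and the function name coverage is hypothetical.

```python
# Invented weekly demand scenarios (units) for a single ASIN.
demand_scenarios = [80, 90, 95, 100, 105, 110, 120, 130, 150, 170]


def coverage(stock, scenarios):
    """Share of scenarios in which the given stock would meet demand."""
    covered = sum(1 for demand in scenarios if demand <= stock)
    return covered / len(scenarios)


# Stock levels chosen so that they act as the P70 and P90 of the scenarios.
p70_stock, p90_stock = 120, 150
print(coverage(p70_stock, demand_scenarios))
print(coverage(p90_stock, demand_scenarios))
```

Stocking to a higher percentile lowers the chance of a shortage at the cost of holding more inventory.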

Usage

USAGE_CREDITS_REMAINING - This shows your quota of Snowflake credits, how much has been used, and how much remains.

USAGE_CREDIT_CONSUMPTION_HISTORY - This shows historical credit usage, broken down by compute vs. cloud services.

USAGE_DESTINATION_CREATION_HISTORY - This shows the full history of destination creations by all users linked to the workspace.

USAGE_QUERY_HISTORY - This shows the full history of queries run across different destinations.

Preview

Schemas labeled as "Preview" contain temporary tables whose form and content you can expect to change occasionally. As of April 2023, the following data are included that may be of interest:

  • Vendor Data (PREVIEW_RAW_VENDOR)
    • Sales
    • Inventory
    • Traffic
    • Product Margins
    • Demand Forecasts
  • Seller Data (PREVIEW_RAW_SELLER)
    • Traffic & Sessions
  • Others (PREVIEW)
    • Product Listing Quality Analysis
    • Product Attributes
    • Units Bought as seen on Amazon

Frequently Asked Questions

How can I see data on sessions and page views of my products?

Currently, this data is available as a preview in the PREVIEW_RAW_SELLER schema. Here, you'll find data on sessions and page views broken down by platform (browser vs. mobile app). This data is available at SKU-level, Child ASIN-level, and Parent ASIN-level.

How can I check which brands dominate a specific category?

Upon tracking a category, the table REPORTS.REPORT_MARKET_PRODUCT_SALES_RANK_AND_ESTIMATES will contain this data within both the brand and brand_catalog columns.
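
One way to answer this is to sum estimated sales per brand within a tracked category. The sketch below runs such a query against an in-memory SQLite table so that it is runnable anywhere; the SQL could be adapted to REPORTS.REPORT_MARKET_PRODUCT_SALES_RANK_AND_ESTIMATES in a Destination. Apart from brand and brand_catalog, the column names (asin, browse_node_id, estimated_units) and all rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table; only brand and brand_catalog are documented column names.
conn.execute("""
    CREATE TABLE report_market_product_sales_rank_and_estimates (
        asin TEXT,
        browse_node_id TEXT,
        brand TEXT,
        brand_catalog TEXT,
        estimated_units INTEGER
    )
""")
conn.executemany(
    "INSERT INTO report_market_product_sales_rank_and_estimates "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("B0EXAMPLE1", "123", "Acme", "ACME", 300),
        ("B0EXAMPLE2", "123", "Acme", "ACME", 200),
        ("B0EXAMPLE3", "123", "Zenith", "Zenith", 400),
        ("B0EXAMPLE4", "999", "Other", "Other", 900),  # different category
    ],
)

# Total estimated units per brand within one category, largest first.
brand_share = conn.execute(
    """
    SELECT brand, SUM(estimated_units) AS units
    FROM report_market_product_sales_rank_and_estimates
    WHERE browse_node_id = ?
    GROUP BY brand
    ORDER BY units DESC
    """,
    ("123",),
).fetchall()
conn.close()

print(brand_share)
```

Grouping by brand_catalog instead of brand may give a slightly different ranking, since the two sources can spell brand names differently.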

How can I check which brands dominate search results for a specific keyword?

Upon tracking a keyword, the tables REPORTS.REPORT_SEO_PRODUCT_ORGANIC_RANK and REPORTS.REPORT_SEO_PRODUCT_SPONSORED_RANK will contain this data within both the brand and brand_catalog columns.

Why do some of these tables contain two brand columns?

We gather brand data from two distinct sources - directly from the Amazon website and via their API. It is worth noting that brand names may vary between these two sources, and thus, we present both to provide more exhaustive information to the user.
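
Because the two columns can disagree, it may help to reconcile them before grouping by brand. The helper below is a hypothetical sketch, not part of DataHawk: it prefers brand, falls back to brand_catalog when brand is missing, and flags rows where the two differ after basic normalization. Which source should take precedence is an assumption to adapt to your own analysis.

```python
def normalize(name):
    """Lowercase and collapse whitespace; return None for missing or blank names."""
    if name is None or not name.strip():
        return None
    return " ".join(name.split()).lower()


def reconcile_brand(brand, brand_catalog):
    """Return (chosen_brand, sources_disagree) for one product row."""
    web, api = normalize(brand), normalize(brand_catalog)
    # ASSUMPTION: the website value wins whenever it is present.
    chosen = brand if web is not None else brand_catalog
    disagree = web is not None and api is not None and web != api
    return chosen, disagree


print(reconcile_brand("Acme", "ACME "))
print(reconcile_brand(None, "Zenith"))
print(reconcile_brand("Acme", "Acme Corp"))
```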

What if I want to analyze brand data among products within my other data, like Inventory or Advertising?

Simply refer to REPORTS.REPORT_PRODUCT_LATEST_VERSION. This table contains brand data, as well as other basic data points, on all your products.
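
A lookup of this kind might look as follows. The sketch joins hypothetical inventory rows to hypothetical REPORT_PRODUCT_LATEST_VERSION rows on an assumed shared asin column and totals units per brand; the column names asin and available_units and every value are invented for illustration.

```python
from collections import Counter

# Hypothetical rows shaped like REPORT_PRODUCT_LATEST_VERSION.
product_latest = [
    {"asin": "B0EXAMPLE1", "brand": "Acme"},
    {"asin": "B0EXAMPLE2", "brand": "Acme"},
    {"asin": "B0EXAMPLE3", "brand": "Zenith"},
]

# Hypothetical rows shaped like an inventory table.
inventory = [
    {"asin": "B0EXAMPLE1", "available_units": 120},
    {"asin": "B0EXAMPLE2", "available_units": 30},
    {"asin": "B0EXAMPLE3", "available_units": 75},
    {"asin": "B0EXAMPLE4", "available_units": 10},  # no matching product row
]

# Build the lookup once, then attach a brand to every inventory row.
brand_by_asin = {p["asin"]: p["brand"] for p in product_latest}

units_by_brand = Counter()
for row in inventory:
    # Unmatched products are grouped under "(unknown)" rather than dropped.
    brand = brand_by_asin.get(row["asin"], "(unknown)")
    units_by_brand[brand] += row["available_units"]

print(dict(units_by_brand))
```

In SQL, the same result would come from a LEFT JOIN on the product identifier followed by a GROUP BY on brand.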

Why are some schemas labeled "Raw"?

The "Raw" label means that this data is exactly what we have collected from Amazon's APIs, with very minimal transformation on DataHawk's side.

Why are some schemas labeled "Preview"?

The "Preview" label means that this data is currently in the works, and subject to change in terms of form and content. However, we believe that this data can already provide value to you despite being a work-in-progress.

I want to access this data in a more raw/unaggregated format. Would this be possible?

In most cases, yes! If you would like to request this type of access, please reach out to us, and we will do our best to accommodate your request.