generic: add 8139cp fixes, enable hardware csum/tso on 4.0+

This contains two sets of fixes for the 8139cp driver.

For all kernel versions older than 4.3, we can apply the fixes from the
4.3-rc4 kernel. In particular, these fix the TX timeout recovery which
is causing my Geos to lock up until the hardware watchdog kicks in.

For 4.0 and later kernels, we can also apply the additional improvements
which are going into 4.4 to fix and enable hardware checksum/TSO
offload. Backporting those to older kernels is non-trivial.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

SVN-Revision: 47220
This commit is contained in:
John Crispin 2015-10-19 10:09:54 +00:00
parent 1c74d046ed
commit 024083a556
6 changed files with 1416 additions and 0 deletions

View file

@ -0,0 +1,367 @@
commit 41b976414c88016e2c9d9b2f6667ee67a998d388
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:45:31 2015 +0100
8139cp: Dump contents of descriptor ring on TX timeout
We are seeing unexplained TX timeouts under heavy load. Let's try to get
a better idea of what's going on.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7f4c685633e2df9ba10d49a31dda13715745db37
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:45:16 2015 +0100
8139cp: Fix DMA unmapping of transmitted buffers
The low 16 bits of the 'opts1' field in the TX descriptor are supposed
to still contain the buffer length when the descriptor is handed back to
us. In practice, at least on my hardware, they don't. So stash the
original value of the opts1 field and get the length to unmap from
there.
There are other ways we could have worked out the length, but I actually
want a stash of the opts1 field anyway so that I can dump it alongside
the contents of the descriptor ring when we suffer a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 0a5aeee0b79fa99d8e04c98dd4e87d4f52aa497b
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:57 2015 +0100
8139cp: Reduce duplicate csum/tso code in cp_start_xmit()
We calculate the value of the opts1 descriptor field in three different
places. With two different behaviours when given an invalid packet to
be checksummed — none of them correct. Sort that out.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit a3b804043f490aeec57d8ca5baccdd35e6250857
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:38 2015 +0100
8139cp: Fix TSO/scatter-gather descriptor setup
When sending a TSO frame in multiple buffers, we were neglecting to set
the first descriptor up in TSO mode.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 26b0bad6ac3a0167792dc4ffb276c29bc597d239
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:06 2015 +0100
8139cp: Fix tx_queued debug message to print correct slot numbers
After a certain amount of staring at the debug output of this driver, I
realised it was lying to me.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit aaa0062ecf4877a26dea66bee1039c6eaf906c94
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:43:41 2015 +0100
8139cp: Do not re-enable RX interrupts in cp_tx_timeout()
If an RX interrupt was already received but NAPI has not yet run when
the RX timeout happens, we end up in cp_tx_timeout() with RX interrupts
already disabled. Blindly re-enabling them will cause an IRQ storm.
(This is made particularly horrid by the fact that cp_interrupt() always
returns that it's handled the interrupt, even when it hasn't actually
done anything. If it didn't do that, the core IRQ code would have
detected the storm and handled it, I'd have had a clear smoking gun
backtrace instead of just a spontaneously resetting router, and I'd have
at *least* two days of my life back. Changing the return value of
cp_interrupt() will be argued about under separate cover.)
Unconditionally leave RX interrupts disabled after the reset, and
schedule NAPI to check the receive ring and re-enable them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7a8a8e75d505147358b225173e890ada43a267e2
Author: David Woodhouse <dwmw2@infradead.org>
Date: Fri Sep 18 00:21:54 2015 +0100
8139cp: Call __cp_set_rx_mode() from cp_tx_timeout()
Unless we reset the RX config, on real hardware I don't seem to receive
any packets after a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit fc27bd115b334e3ebdc682a42a47c3aea2566dcc
Author: David Woodhouse <dwmw2@infradead.org>
Date: Fri Sep 18 00:19:08 2015 +0100
8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings()
This can be called from cp_tx_timeout() with interrupts disabled.
Spotted by Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index d79e33b..686334f 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -157,6 +157,7 @@ enum {
NWayAdvert = 0x66, /* MII ADVERTISE */
NWayLPAR = 0x68, /* MII LPA */
NWayExpansion = 0x6A, /* MII Expansion */
+ TxDmaOkLowDesc = 0x82, /* Low 16 bit address of a Tx descriptor. */
Config5 = 0xD8, /* Config5 */
TxPoll = 0xD9, /* Tell chip to check Tx descriptors for work */
RxMaxSize = 0xDA, /* Max size of an Rx packet (8169 only) */
@@ -341,6 +342,7 @@ struct cp_private {
unsigned tx_tail;
struct cp_desc *tx_ring;
struct sk_buff *tx_skb[CP_TX_RING_SIZE];
+ u32 tx_opts[CP_TX_RING_SIZE];
unsigned rx_buf_sz;
unsigned wol_enabled : 1; /* Is Wake-on-LAN enabled? */
@@ -665,7 +667,7 @@ static void cp_tx (struct cp_private *cp)
BUG_ON(!skb);
dma_unmap_single(&cp->pdev->dev, le64_to_cpu(txd->addr),
- le32_to_cpu(txd->opts1) & 0xffff,
+ cp->tx_opts[tx_tail] & 0xffff,
PCI_DMA_TODEVICE);
if (status & LastFrag) {
@@ -733,7 +735,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
{
struct cp_private *cp = netdev_priv(dev);
unsigned entry;
- u32 eor, flags;
+ u32 eor, opts1;
unsigned long intr_flags;
__le32 opts2;
int mss = 0;
@@ -753,6 +755,21 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
mss = skb_shinfo(skb)->gso_size;
opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
+ opts1 = DescOwn;
+ if (mss)
+ opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+ else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ const struct iphdr *ip = ip_hdr(skb);
+ if (ip->protocol == IPPROTO_TCP)
+ opts1 |= IPCS | TCPCS;
+ else if (ip->protocol == IPPROTO_UDP)
+ opts1 |= IPCS | UDPCS;
+ else {
+ WARN_ONCE(1,
+ "Net bug: asked to checksum invalid Legacy IP packet\n");
+ goto out_dma_error;
+ }
+ }
if (skb_shinfo(skb)->nr_frags == 0) {
struct cp_desc *txd = &cp->tx_ring[entry];
@@ -768,31 +785,20 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->addr = cpu_to_le64(mapping);
wmb();
- flags = eor | len | DescOwn | FirstFrag | LastFrag;
-
- if (mss)
- flags |= LargeSend | ((mss & MSSMask) << MSSShift);
- else if (skb->ip_summed == CHECKSUM_PARTIAL) {
- const struct iphdr *ip = ip_hdr(skb);
- if (ip->protocol == IPPROTO_TCP)
- flags |= IPCS | TCPCS;
- else if (ip->protocol == IPPROTO_UDP)
- flags |= IPCS | UDPCS;
- else
- WARN_ON(1); /* we need a WARN() */
- }
+ opts1 |= eor | len | FirstFrag | LastFrag;
- txd->opts1 = cpu_to_le32(flags);
+ txd->opts1 = cpu_to_le32(opts1);
wmb();
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
+ cp->tx_opts[entry] = opts1;
+ netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n",
+ entry, skb->len);
} else {
struct cp_desc *txd;
- u32 first_len, first_eor;
+ u32 first_len, first_eor, ctrl;
dma_addr_t first_mapping;
int frag, first_entry = entry;
- const struct iphdr *ip = ip_hdr(skb);
/* We must give this initial chunk to the device last.
* Otherwise we could race with the device.
@@ -805,14 +811,14 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
goto out_dma_error;
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
for (frag = 0; frag < skb_shinfo(skb)->nr_frags; frag++) {
const skb_frag_t *this_frag = &skb_shinfo(skb)->frags[frag];
u32 len;
- u32 ctrl;
dma_addr_t mapping;
+ entry = NEXT_TX(entry);
+
len = skb_frag_size(this_frag);
mapping = dma_map_single(&cp->pdev->dev,
skb_frag_address(this_frag),
@@ -824,19 +830,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
- ctrl = eor | len | DescOwn;
-
- if (mss)
- ctrl |= LargeSend |
- ((mss & MSSMask) << MSSShift);
- else if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (ip->protocol == IPPROTO_TCP)
- ctrl |= IPCS | TCPCS;
- else if (ip->protocol == IPPROTO_UDP)
- ctrl |= IPCS | UDPCS;
- else
- BUG();
- }
+ ctrl = opts1 | eor | len;
if (frag == skb_shinfo(skb)->nr_frags - 1)
ctrl |= LastFrag;
@@ -849,8 +843,8 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->opts1 = cpu_to_le32(ctrl);
wmb();
+ cp->tx_opts[entry] = ctrl;
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
}
txd = &cp->tx_ring[first_entry];
@@ -858,27 +852,17 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->addr = cpu_to_le64(first_mapping);
wmb();
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (ip->protocol == IPPROTO_TCP)
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn |
- IPCS | TCPCS);
- else if (ip->protocol == IPPROTO_UDP)
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn |
- IPCS | UDPCS);
- else
- BUG();
- } else
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn);
+ ctrl = opts1 | first_eor | first_len | FirstFrag;
+ txd->opts1 = cpu_to_le32(ctrl);
wmb();
+
+ cp->tx_opts[first_entry] = ctrl;
+ netif_dbg(cp, tx_queued, cp->dev, "tx queued, slots %d-%d, skblen %d\n",
+ first_entry, entry, skb->len);
}
- cp->tx_head = entry;
+ cp->tx_head = NEXT_TX(entry);
netdev_sent_queue(dev, skb->len);
- netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n",
- entry, skb->len);
if (TX_BUFFS_AVAIL(cp) <= (MAX_SKB_FRAGS + 1))
netif_stop_queue(dev);
@@ -1115,6 +1099,7 @@ static int cp_init_rings (struct cp_private *cp)
{
memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);
cp->tx_ring[CP_TX_RING_SIZE - 1].opts1 = cpu_to_le32(RingEnd);
+ memset(cp->tx_opts, 0, sizeof(cp->tx_opts));
cp_init_rings_index(cp);
@@ -1151,7 +1136,7 @@ static void cp_clean_rings (struct cp_private *cp)
desc = cp->rx_ring + i;
dma_unmap_single(&cp->pdev->dev,le64_to_cpu(desc->addr),
cp->rx_buf_sz, PCI_DMA_FROMDEVICE);
- dev_kfree_skb(cp->rx_skb[i]);
+ dev_kfree_skb_any(cp->rx_skb[i]);
}
}
@@ -1164,7 +1149,7 @@ static void cp_clean_rings (struct cp_private *cp)
le32_to_cpu(desc->opts1) & 0xffff,
PCI_DMA_TODEVICE);
if (le32_to_cpu(desc->opts1) & LastFrag)
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
cp->dev->stats.tx_dropped++;
}
}
@@ -1172,6 +1157,7 @@ static void cp_clean_rings (struct cp_private *cp)
memset(cp->rx_ring, 0, sizeof(struct cp_desc) * CP_RX_RING_SIZE);
memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);
+ memset(cp->tx_opts, 0, sizeof(cp->tx_opts));
memset(cp->rx_skb, 0, sizeof(struct sk_buff *) * CP_RX_RING_SIZE);
memset(cp->tx_skb, 0, sizeof(struct sk_buff *) * CP_TX_RING_SIZE);
@@ -1249,7 +1235,7 @@ static void cp_tx_timeout(struct net_device *dev)
{
struct cp_private *cp = netdev_priv(dev);
unsigned long flags;
- int rc;
+ int rc, i;
netdev_warn(dev, "Transmit timeout, status %2x %4x %4x %4x\n",
cpr8(Cmd), cpr16(CpCmd),
@@ -1257,13 +1243,26 @@ static void cp_tx_timeout(struct net_device *dev)
spin_lock_irqsave(&cp->lock, flags);
+ netif_dbg(cp, tx_err, cp->dev, "TX ring head %d tail %d desc %x\n",
+ cp->tx_head, cp->tx_tail, cpr16(TxDmaOkLowDesc));
+ for (i = 0; i < CP_TX_RING_SIZE; i++) {
+ netif_dbg(cp, tx_err, cp->dev,
+ "TX slot %d @%p: %08x (%08x) %08x %llx %p\n",
+ i, &cp->tx_ring[i], le32_to_cpu(cp->tx_ring[i].opts1),
+ cp->tx_opts[i], le32_to_cpu(cp->tx_ring[i].opts2),
+ le64_to_cpu(cp->tx_ring[i].addr),
+ cp->tx_skb[i]);
+ }
+
cp_stop_hw(cp);
cp_clean_rings(cp);
rc = cp_init_rings(cp);
cp_start_hw(cp);
- cp_enable_irq(cp);
+ __cp_set_rx_mode(dev);
+ cpw16_f(IntrMask, cp_norx_intr_mask);
netif_wake_queue(dev);
+ napi_schedule(&cp->napi);
spin_unlock_irqrestore(&cp->lock, flags);
}

View file

@ -0,0 +1,367 @@
commit 41b976414c88016e2c9d9b2f6667ee67a998d388
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:45:31 2015 +0100
8139cp: Dump contents of descriptor ring on TX timeout
We are seeing unexplained TX timeouts under heavy load. Let's try to get
a better idea of what's going on.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7f4c685633e2df9ba10d49a31dda13715745db37
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:45:16 2015 +0100
8139cp: Fix DMA unmapping of transmitted buffers
The low 16 bits of the 'opts1' field in the TX descriptor are supposed
to still contain the buffer length when the descriptor is handed back to
us. In practice, at least on my hardware, they don't. So stash the
original value of the opts1 field and get the length to unmap from
there.
There are other ways we could have worked out the length, but I actually
want a stash of the opts1 field anyway so that I can dump it alongside
the contents of the descriptor ring when we suffer a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 0a5aeee0b79fa99d8e04c98dd4e87d4f52aa497b
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:57 2015 +0100
8139cp: Reduce duplicate csum/tso code in cp_start_xmit()
We calculate the value of the opts1 descriptor field in three different
places. With two different behaviours when given an invalid packet to
be checksummed — none of them correct. Sort that out.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit a3b804043f490aeec57d8ca5baccdd35e6250857
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:38 2015 +0100
8139cp: Fix TSO/scatter-gather descriptor setup
When sending a TSO frame in multiple buffers, we were neglecting to set
the first descriptor up in TSO mode.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 26b0bad6ac3a0167792dc4ffb276c29bc597d239
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:06 2015 +0100
8139cp: Fix tx_queued debug message to print correct slot numbers
After a certain amount of staring at the debug output of this driver, I
realised it was lying to me.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit aaa0062ecf4877a26dea66bee1039c6eaf906c94
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:43:41 2015 +0100
8139cp: Do not re-enable RX interrupts in cp_tx_timeout()
If an RX interrupt was already received but NAPI has not yet run when
the RX timeout happens, we end up in cp_tx_timeout() with RX interrupts
already disabled. Blindly re-enabling them will cause an IRQ storm.
(This is made particularly horrid by the fact that cp_interrupt() always
returns that it's handled the interrupt, even when it hasn't actually
done anything. If it didn't do that, the core IRQ code would have
detected the storm and handled it, I'd have had a clear smoking gun
backtrace instead of just a spontaneously resetting router, and I'd have
at *least* two days of my life back. Changing the return value of
cp_interrupt() will be argued about under separate cover.)
Unconditionally leave RX interrupts disabled after the reset, and
schedule NAPI to check the receive ring and re-enable them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7a8a8e75d505147358b225173e890ada43a267e2
Author: David Woodhouse <dwmw2@infradead.org>
Date: Fri Sep 18 00:21:54 2015 +0100
8139cp: Call __cp_set_rx_mode() from cp_tx_timeout()
Unless we reset the RX config, on real hardware I don't seem to receive
any packets after a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit fc27bd115b334e3ebdc682a42a47c3aea2566dcc
Author: David Woodhouse <dwmw2@infradead.org>
Date: Fri Sep 18 00:19:08 2015 +0100
8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings()
This can be called from cp_tx_timeout() with interrupts disabled.
Spotted by Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index d79e33b..686334f 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -157,6 +157,7 @@ enum {
NWayAdvert = 0x66, /* MII ADVERTISE */
NWayLPAR = 0x68, /* MII LPA */
NWayExpansion = 0x6A, /* MII Expansion */
+ TxDmaOkLowDesc = 0x82, /* Low 16 bit address of a Tx descriptor. */
Config5 = 0xD8, /* Config5 */
TxPoll = 0xD9, /* Tell chip to check Tx descriptors for work */
RxMaxSize = 0xDA, /* Max size of an Rx packet (8169 only) */
@@ -341,6 +342,7 @@ struct cp_private {
unsigned tx_tail;
struct cp_desc *tx_ring;
struct sk_buff *tx_skb[CP_TX_RING_SIZE];
+ u32 tx_opts[CP_TX_RING_SIZE];
unsigned rx_buf_sz;
unsigned wol_enabled : 1; /* Is Wake-on-LAN enabled? */
@@ -665,7 +667,7 @@ static void cp_tx (struct cp_private *cp)
BUG_ON(!skb);
dma_unmap_single(&cp->pdev->dev, le64_to_cpu(txd->addr),
- le32_to_cpu(txd->opts1) & 0xffff,
+ cp->tx_opts[tx_tail] & 0xffff,
PCI_DMA_TODEVICE);
if (status & LastFrag) {
@@ -733,7 +735,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
{
struct cp_private *cp = netdev_priv(dev);
unsigned entry;
- u32 eor, flags;
+ u32 eor, opts1;
unsigned long intr_flags;
__le32 opts2;
int mss = 0;
@@ -753,6 +755,21 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
mss = skb_shinfo(skb)->gso_size;
opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
+ opts1 = DescOwn;
+ if (mss)
+ opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+ else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ const struct iphdr *ip = ip_hdr(skb);
+ if (ip->protocol == IPPROTO_TCP)
+ opts1 |= IPCS | TCPCS;
+ else if (ip->protocol == IPPROTO_UDP)
+ opts1 |= IPCS | UDPCS;
+ else {
+ WARN_ONCE(1,
+ "Net bug: asked to checksum invalid Legacy IP packet\n");
+ goto out_dma_error;
+ }
+ }
if (skb_shinfo(skb)->nr_frags == 0) {
struct cp_desc *txd = &cp->tx_ring[entry];
@@ -768,31 +785,20 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->addr = cpu_to_le64(mapping);
wmb();
- flags = eor | len | DescOwn | FirstFrag | LastFrag;
-
- if (mss)
- flags |= LargeSend | ((mss & MSSMask) << MSSShift);
- else if (skb->ip_summed == CHECKSUM_PARTIAL) {
- const struct iphdr *ip = ip_hdr(skb);
- if (ip->protocol == IPPROTO_TCP)
- flags |= IPCS | TCPCS;
- else if (ip->protocol == IPPROTO_UDP)
- flags |= IPCS | UDPCS;
- else
- WARN_ON(1); /* we need a WARN() */
- }
+ opts1 |= eor | len | FirstFrag | LastFrag;
- txd->opts1 = cpu_to_le32(flags);
+ txd->opts1 = cpu_to_le32(opts1);
wmb();
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
+ cp->tx_opts[entry] = opts1;
+ netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n",
+ entry, skb->len);
} else {
struct cp_desc *txd;
- u32 first_len, first_eor;
+ u32 first_len, first_eor, ctrl;
dma_addr_t first_mapping;
int frag, first_entry = entry;
- const struct iphdr *ip = ip_hdr(skb);
/* We must give this initial chunk to the device last.
* Otherwise we could race with the device.
@@ -805,14 +811,14 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
goto out_dma_error;
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
for (frag = 0; frag < skb_shinfo(skb)->nr_frags; frag++) {
const skb_frag_t *this_frag = &skb_shinfo(skb)->frags[frag];
u32 len;
- u32 ctrl;
dma_addr_t mapping;
+ entry = NEXT_TX(entry);
+
len = skb_frag_size(this_frag);
mapping = dma_map_single(&cp->pdev->dev,
skb_frag_address(this_frag),
@@ -824,19 +830,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
- ctrl = eor | len | DescOwn;
-
- if (mss)
- ctrl |= LargeSend |
- ((mss & MSSMask) << MSSShift);
- else if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (ip->protocol == IPPROTO_TCP)
- ctrl |= IPCS | TCPCS;
- else if (ip->protocol == IPPROTO_UDP)
- ctrl |= IPCS | UDPCS;
- else
- BUG();
- }
+ ctrl = opts1 | eor | len;
if (frag == skb_shinfo(skb)->nr_frags - 1)
ctrl |= LastFrag;
@@ -849,8 +843,8 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->opts1 = cpu_to_le32(ctrl);
wmb();
+ cp->tx_opts[entry] = ctrl;
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
}
txd = &cp->tx_ring[first_entry];
@@ -858,27 +852,17 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->addr = cpu_to_le64(first_mapping);
wmb();
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (ip->protocol == IPPROTO_TCP)
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn |
- IPCS | TCPCS);
- else if (ip->protocol == IPPROTO_UDP)
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn |
- IPCS | UDPCS);
- else
- BUG();
- } else
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn);
+ ctrl = opts1 | first_eor | first_len | FirstFrag;
+ txd->opts1 = cpu_to_le32(ctrl);
wmb();
+
+ cp->tx_opts[first_entry] = ctrl;
+ netif_dbg(cp, tx_queued, cp->dev, "tx queued, slots %d-%d, skblen %d\n",
+ first_entry, entry, skb->len);
}
- cp->tx_head = entry;
+ cp->tx_head = NEXT_TX(entry);
netdev_sent_queue(dev, skb->len);
- netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n",
- entry, skb->len);
if (TX_BUFFS_AVAIL(cp) <= (MAX_SKB_FRAGS + 1))
netif_stop_queue(dev);
@@ -1115,6 +1099,7 @@ static int cp_init_rings (struct cp_private *cp)
{
memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);
cp->tx_ring[CP_TX_RING_SIZE - 1].opts1 = cpu_to_le32(RingEnd);
+ memset(cp->tx_opts, 0, sizeof(cp->tx_opts));
cp_init_rings_index(cp);
@@ -1151,7 +1136,7 @@ static void cp_clean_rings (struct cp_private *cp)
desc = cp->rx_ring + i;
dma_unmap_single(&cp->pdev->dev,le64_to_cpu(desc->addr),
cp->rx_buf_sz, PCI_DMA_FROMDEVICE);
- dev_kfree_skb(cp->rx_skb[i]);
+ dev_kfree_skb_any(cp->rx_skb[i]);
}
}
@@ -1164,7 +1149,7 @@ static void cp_clean_rings (struct cp_private *cp)
le32_to_cpu(desc->opts1) & 0xffff,
PCI_DMA_TODEVICE);
if (le32_to_cpu(desc->opts1) & LastFrag)
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
cp->dev->stats.tx_dropped++;
}
}
@@ -1172,6 +1157,7 @@ static void cp_clean_rings (struct cp_private *cp)
memset(cp->rx_ring, 0, sizeof(struct cp_desc) * CP_RX_RING_SIZE);
memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);
+ memset(cp->tx_opts, 0, sizeof(cp->tx_opts));
memset(cp->rx_skb, 0, sizeof(struct sk_buff *) * CP_RX_RING_SIZE);
memset(cp->tx_skb, 0, sizeof(struct sk_buff *) * CP_TX_RING_SIZE);
@@ -1249,7 +1235,7 @@ static void cp_tx_timeout(struct net_device *dev)
{
struct cp_private *cp = netdev_priv(dev);
unsigned long flags;
- int rc;
+ int rc, i;
netdev_warn(dev, "Transmit timeout, status %2x %4x %4x %4x\n",
cpr8(Cmd), cpr16(CpCmd),
@@ -1257,13 +1243,26 @@ static void cp_tx_timeout(struct net_device *dev)
spin_lock_irqsave(&cp->lock, flags);
+ netif_dbg(cp, tx_err, cp->dev, "TX ring head %d tail %d desc %x\n",
+ cp->tx_head, cp->tx_tail, cpr16(TxDmaOkLowDesc));
+ for (i = 0; i < CP_TX_RING_SIZE; i++) {
+ netif_dbg(cp, tx_err, cp->dev,
+ "TX slot %d @%p: %08x (%08x) %08x %llx %p\n",
+ i, &cp->tx_ring[i], le32_to_cpu(cp->tx_ring[i].opts1),
+ cp->tx_opts[i], le32_to_cpu(cp->tx_ring[i].opts2),
+ le64_to_cpu(cp->tx_ring[i].addr),
+ cp->tx_skb[i]);
+ }
+
cp_stop_hw(cp);
cp_clean_rings(cp);
rc = cp_init_rings(cp);
cp_start_hw(cp);
- cp_enable_irq(cp);
+ __cp_set_rx_mode(dev);
+ cpw16_f(IntrMask, cp_norx_intr_mask);
netif_wake_queue(dev);
+ napi_schedule_irqoff(&cp->napi);
spin_unlock_irqrestore(&cp->lock, flags);
}

View file

@ -0,0 +1,105 @@
commit 8b7a7048220f86547db31de0abe1ea6dd2cfa892
Author: David Woodhouse <dwmw2@infradead.org>
Date: Thu Sep 24 11:38:22 2015 +0100
8139cp: Fix GSO MSS handling
When fixing the TSO support I noticed we just mask ->gso_size with the
MSSMask value and don't care about the consequences.
Provide a .ndo_features_check() method which drops the NETIF_F_TSO
feature for any skb which would exceed the maximum, and thus forces it
to be segmented by software.
Then we can stop the masking in cp_start_xmit(), and just WARN if the
maximum is exceeded, which should now never happen.
Finally, Francois Romieu noticed that we didn't even have the right
value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 5a58f227790faded5a3ef6075f3ddd65093e0f86
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:46:09 2015 +0100
8139cp: Enable offload features by default
I fixed TSO. Hardware checksum and scatter/gather also appear to be
working correctly both on real hardware and in QEMU's emulation.
Let's enable them by default and see if anyone screams...
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 686334f..deae10d 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -175,7 +175,7 @@ enum {
LastFrag = (1 << 28), /* Final segment of a packet */
LargeSend = (1 << 27), /* TCP Large Send Offload (TSO) */
MSSShift = 16, /* MSS value position */
- MSSMask = 0xfff, /* MSS value: 11 bits */
+ MSSMask = 0x7ff, /* MSS value: 11 bits */
TxError = (1 << 23), /* Tx error summary */
RxError = (1 << 20), /* Rx error summary */
IPCS = (1 << 18), /* Calculate IP checksum */
@@ -754,10 +754,16 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
mss = skb_shinfo(skb)->gso_size;
+ if (mss > MSSMask) {
+ WARN_ONCE(1, "Net bug: GSO size %d too large for 8139CP\n",
+ mss);
+ goto out_dma_error;
+ }
+
opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
opts1 = DescOwn;
if (mss)
- opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+ opts1 |= LargeSend | (mss << MSSShift);
else if (skb->ip_summed == CHECKSUM_PARTIAL) {
const struct iphdr *ip = ip_hdr(skb);
if (ip->protocol == IPPROTO_TCP)
@@ -1852,6 +1858,15 @@ static void cp_set_d3_state (struct cp_private *cp)
pci_set_power_state (cp->pdev, PCI_D3hot);
}
+static netdev_features_t cp_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ if (skb_shinfo(skb)->gso_size > MSSMask)
+ features &= ~NETIF_F_TSO;
+
+ return vlan_features_check(skb, features);
+}
static const struct net_device_ops cp_netdev_ops = {
.ndo_open = cp_open,
.ndo_stop = cp_close,
@@ -1864,6 +1879,7 @@ static const struct net_device_ops cp_netdev_ops = {
.ndo_tx_timeout = cp_tx_timeout,
.ndo_set_features = cp_set_features,
.ndo_change_mtu = cp_change_mtu,
+ .ndo_features_check = cp_features_check,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = cp_poll_controller,
@@ -1983,12 +1999,12 @@ static int cp_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
dev->ethtool_ops = &cp_ethtool_ops;
dev->watchdog_timeo = TX_TIMEOUT;
- dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
+ dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
if (pci_using_dac)
dev->features |= NETIF_F_HIGHDMA;
- /* disabled by default until verified */
dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |

View file

@ -0,0 +1,367 @@
commit 41b976414c88016e2c9d9b2f6667ee67a998d388
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:45:31 2015 +0100
8139cp: Dump contents of descriptor ring on TX timeout
We are seeing unexplained TX timeouts under heavy load. Let's try to get
a better idea of what's going on.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7f4c685633e2df9ba10d49a31dda13715745db37
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:45:16 2015 +0100
8139cp: Fix DMA unmapping of transmitted buffers
The low 16 bits of the 'opts1' field in the TX descriptor are supposed
to still contain the buffer length when the descriptor is handed back to
us. In practice, at least on my hardware, they don't. So stash the
original value of the opts1 field and get the length to unmap from
there.
There are other ways we could have worked out the length, but I actually
want a stash of the opts1 field anyway so that I can dump it alongside
the contents of the descriptor ring when we suffer a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 0a5aeee0b79fa99d8e04c98dd4e87d4f52aa497b
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:57 2015 +0100
8139cp: Reduce duplicate csum/tso code in cp_start_xmit()
We calculate the value of the opts1 descriptor field in three different
places. With two different behaviours when given an invalid packet to
be checksummed — none of them correct. Sort that out.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit a3b804043f490aeec57d8ca5baccdd35e6250857
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:38 2015 +0100
8139cp: Fix TSO/scatter-gather descriptor setup
When sending a TSO frame in multiple buffers, we were neglecting to set
the first descriptor up in TSO mode.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 26b0bad6ac3a0167792dc4ffb276c29bc597d239
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:44:06 2015 +0100
8139cp: Fix tx_queued debug message to print correct slot numbers
After a certain amount of staring at the debug output of this driver, I
realised it was lying to me.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit aaa0062ecf4877a26dea66bee1039c6eaf906c94
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:43:41 2015 +0100
8139cp: Do not re-enable RX interrupts in cp_tx_timeout()
If an RX interrupt was already received but NAPI has not yet run when
the RX timeout happens, we end up in cp_tx_timeout() with RX interrupts
already disabled. Blindly re-enabling them will cause an IRQ storm.
(This is made particularly horrid by the fact that cp_interrupt() always
returns that it's handled the interrupt, even when it hasn't actually
done anything. If it didn't do that, the core IRQ code would have
detected the storm and handled it, I'd have had a clear smoking gun
backtrace instead of just a spontaneously resetting router, and I'd have
at *least* two days of my life back. Changing the return value of
cp_interrupt() will be argued about under separate cover.)
Unconditionally leave RX interrupts disabled after the reset, and
schedule NAPI to check the receive ring and re-enable them.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7a8a8e75d505147358b225173e890ada43a267e2
Author: David Woodhouse <dwmw2@infradead.org>
Date: Fri Sep 18 00:21:54 2015 +0100
8139cp: Call __cp_set_rx_mode() from cp_tx_timeout()
Unless we reset the RX config, on real hardware I don't seem to receive
any packets after a TX timeout.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit fc27bd115b334e3ebdc682a42a47c3aea2566dcc
Author: David Woodhouse <dwmw2@infradead.org>
Date: Fri Sep 18 00:19:08 2015 +0100
8139cp: Use dev_kfree_skb_any() instead of dev_kfree_skb() in cp_clean_rings()
This can be called from cp_tx_timeout() with interrupts disabled.
Spotted by Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index d79e33b..686334f 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -157,6 +157,7 @@ enum {
NWayAdvert = 0x66, /* MII ADVERTISE */
NWayLPAR = 0x68, /* MII LPA */
NWayExpansion = 0x6A, /* MII Expansion */
+ TxDmaOkLowDesc = 0x82, /* Low 16 bit address of a Tx descriptor. */
Config5 = 0xD8, /* Config5 */
TxPoll = 0xD9, /* Tell chip to check Tx descriptors for work */
RxMaxSize = 0xDA, /* Max size of an Rx packet (8169 only) */
@@ -341,6 +342,7 @@ struct cp_private {
unsigned tx_tail;
struct cp_desc *tx_ring;
struct sk_buff *tx_skb[CP_TX_RING_SIZE];
+ u32 tx_opts[CP_TX_RING_SIZE];
unsigned rx_buf_sz;
unsigned wol_enabled : 1; /* Is Wake-on-LAN enabled? */
@@ -665,7 +667,7 @@ static void cp_tx (struct cp_private *cp)
BUG_ON(!skb);
dma_unmap_single(&cp->pdev->dev, le64_to_cpu(txd->addr),
- le32_to_cpu(txd->opts1) & 0xffff,
+ cp->tx_opts[tx_tail] & 0xffff,
PCI_DMA_TODEVICE);
if (status & LastFrag) {
@@ -733,7 +735,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
{
struct cp_private *cp = netdev_priv(dev);
unsigned entry;
- u32 eor, flags;
+ u32 eor, opts1;
unsigned long intr_flags;
__le32 opts2;
int mss = 0;
@@ -753,6 +755,21 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
mss = skb_shinfo(skb)->gso_size;
opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
+ opts1 = DescOwn;
+ if (mss)
+ opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+ else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ const struct iphdr *ip = ip_hdr(skb);
+ if (ip->protocol == IPPROTO_TCP)
+ opts1 |= IPCS | TCPCS;
+ else if (ip->protocol == IPPROTO_UDP)
+ opts1 |= IPCS | UDPCS;
+ else {
+ WARN_ONCE(1,
+ "Net bug: asked to checksum invalid Legacy IP packet\n");
+ goto out_dma_error;
+ }
+ }
if (skb_shinfo(skb)->nr_frags == 0) {
struct cp_desc *txd = &cp->tx_ring[entry];
@@ -768,31 +785,20 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->addr = cpu_to_le64(mapping);
wmb();
- flags = eor | len | DescOwn | FirstFrag | LastFrag;
-
- if (mss)
- flags |= LargeSend | ((mss & MSSMask) << MSSShift);
- else if (skb->ip_summed == CHECKSUM_PARTIAL) {
- const struct iphdr *ip = ip_hdr(skb);
- if (ip->protocol == IPPROTO_TCP)
- flags |= IPCS | TCPCS;
- else if (ip->protocol == IPPROTO_UDP)
- flags |= IPCS | UDPCS;
- else
- WARN_ON(1); /* we need a WARN() */
- }
+ opts1 |= eor | len | FirstFrag | LastFrag;
- txd->opts1 = cpu_to_le32(flags);
+ txd->opts1 = cpu_to_le32(opts1);
wmb();
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
+ cp->tx_opts[entry] = opts1;
+ netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n",
+ entry, skb->len);
} else {
struct cp_desc *txd;
- u32 first_len, first_eor;
+ u32 first_len, first_eor, ctrl;
dma_addr_t first_mapping;
int frag, first_entry = entry;
- const struct iphdr *ip = ip_hdr(skb);
/* We must give this initial chunk to the device last.
* Otherwise we could race with the device.
@@ -805,14 +811,14 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
goto out_dma_error;
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
for (frag = 0; frag < skb_shinfo(skb)->nr_frags; frag++) {
const skb_frag_t *this_frag = &skb_shinfo(skb)->frags[frag];
u32 len;
- u32 ctrl;
dma_addr_t mapping;
+ entry = NEXT_TX(entry);
+
len = skb_frag_size(this_frag);
mapping = dma_map_single(&cp->pdev->dev,
skb_frag_address(this_frag),
@@ -824,19 +830,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
- ctrl = eor | len | DescOwn;
-
- if (mss)
- ctrl |= LargeSend |
- ((mss & MSSMask) << MSSShift);
- else if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (ip->protocol == IPPROTO_TCP)
- ctrl |= IPCS | TCPCS;
- else if (ip->protocol == IPPROTO_UDP)
- ctrl |= IPCS | UDPCS;
- else
- BUG();
- }
+ ctrl = opts1 | eor | len;
if (frag == skb_shinfo(skb)->nr_frags - 1)
ctrl |= LastFrag;
@@ -849,8 +843,8 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->opts1 = cpu_to_le32(ctrl);
wmb();
+ cp->tx_opts[entry] = ctrl;
cp->tx_skb[entry] = skb;
- entry = NEXT_TX(entry);
}
txd = &cp->tx_ring[first_entry];
@@ -858,27 +852,17 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
txd->addr = cpu_to_le64(first_mapping);
wmb();
- if (skb->ip_summed == CHECKSUM_PARTIAL) {
- if (ip->protocol == IPPROTO_TCP)
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn |
- IPCS | TCPCS);
- else if (ip->protocol == IPPROTO_UDP)
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn |
- IPCS | UDPCS);
- else
- BUG();
- } else
- txd->opts1 = cpu_to_le32(first_eor | first_len |
- FirstFrag | DescOwn);
+ ctrl = opts1 | first_eor | first_len | FirstFrag;
+ txd->opts1 = cpu_to_le32(ctrl);
wmb();
+
+ cp->tx_opts[first_entry] = ctrl;
+ netif_dbg(cp, tx_queued, cp->dev, "tx queued, slots %d-%d, skblen %d\n",
+ first_entry, entry, skb->len);
}
- cp->tx_head = entry;
+ cp->tx_head = NEXT_TX(entry);
netdev_sent_queue(dev, skb->len);
- netif_dbg(cp, tx_queued, cp->dev, "tx queued, slot %d, skblen %d\n",
- entry, skb->len);
if (TX_BUFFS_AVAIL(cp) <= (MAX_SKB_FRAGS + 1))
netif_stop_queue(dev);
@@ -1115,6 +1099,7 @@ static int cp_init_rings (struct cp_private *cp)
{
memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);
cp->tx_ring[CP_TX_RING_SIZE - 1].opts1 = cpu_to_le32(RingEnd);
+ memset(cp->tx_opts, 0, sizeof(cp->tx_opts));
cp_init_rings_index(cp);
@@ -1151,7 +1136,7 @@ static void cp_clean_rings (struct cp_private *cp)
desc = cp->rx_ring + i;
dma_unmap_single(&cp->pdev->dev,le64_to_cpu(desc->addr),
cp->rx_buf_sz, PCI_DMA_FROMDEVICE);
- dev_kfree_skb(cp->rx_skb[i]);
+ dev_kfree_skb_any(cp->rx_skb[i]);
}
}
@@ -1164,7 +1149,7 @@ static void cp_clean_rings (struct cp_private *cp)
le32_to_cpu(desc->opts1) & 0xffff,
PCI_DMA_TODEVICE);
if (le32_to_cpu(desc->opts1) & LastFrag)
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
cp->dev->stats.tx_dropped++;
}
}
@@ -1172,6 +1157,7 @@ static void cp_clean_rings (struct cp_private *cp)
memset(cp->rx_ring, 0, sizeof(struct cp_desc) * CP_RX_RING_SIZE);
memset(cp->tx_ring, 0, sizeof(struct cp_desc) * CP_TX_RING_SIZE);
+ memset(cp->tx_opts, 0, sizeof(cp->tx_opts));
memset(cp->rx_skb, 0, sizeof(struct sk_buff *) * CP_RX_RING_SIZE);
memset(cp->tx_skb, 0, sizeof(struct sk_buff *) * CP_TX_RING_SIZE);
@@ -1249,7 +1235,7 @@ static void cp_tx_timeout(struct net_device *dev)
{
struct cp_private *cp = netdev_priv(dev);
unsigned long flags;
- int rc;
+ int rc, i;
netdev_warn(dev, "Transmit timeout, status %2x %4x %4x %4x\n",
cpr8(Cmd), cpr16(CpCmd),
@@ -1257,13 +1243,26 @@ static void cp_tx_timeout(struct net_device *dev)
spin_lock_irqsave(&cp->lock, flags);
+ netif_dbg(cp, tx_err, cp->dev, "TX ring head %d tail %d desc %x\n",
+ cp->tx_head, cp->tx_tail, cpr16(TxDmaOkLowDesc));
+ for (i = 0; i < CP_TX_RING_SIZE; i++) {
+ netif_dbg(cp, tx_err, cp->dev,
+ "TX slot %d @%p: %08x (%08x) %08x %llx %p\n",
+ i, &cp->tx_ring[i], le32_to_cpu(cp->tx_ring[i].opts1),
+ cp->tx_opts[i], le32_to_cpu(cp->tx_ring[i].opts2),
+ le64_to_cpu(cp->tx_ring[i].addr),
+ cp->tx_skb[i]);
+ }
+
cp_stop_hw(cp);
cp_clean_rings(cp);
rc = cp_init_rings(cp);
cp_start_hw(cp);
- cp_enable_irq(cp);
+ __cp_set_rx_mode(dev);
+ cpw16_f(IntrMask, cp_norx_intr_mask);
netif_wake_queue(dev);
+ napi_schedule_irqoff(&cp->napi);
spin_unlock_irqrestore(&cp->lock, flags);
}

View file

@ -0,0 +1,105 @@
commit 8b7a7048220f86547db31de0abe1ea6dd2cfa892
Author: David Woodhouse <dwmw2@infradead.org>
Date: Thu Sep 24 11:38:22 2015 +0100
8139cp: Fix GSO MSS handling
When fixing the TSO support I noticed we just mask ->gso_size with the
MSSMask value and don't care about the consequences.
Provide a .ndo_features_check() method which drops the NETIF_F_TSO
feature for any skb which would exceed the maximum, and thus forces it
to be segmented by software.
Then we can stop the masking in cp_start_xmit(), and just WARN if the
maximum is exceeded, which should now never happen.
Finally, Francois Romieu noticed that we didn't even have the right
value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 5a58f227790faded5a3ef6075f3ddd65093e0f86
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:46:09 2015 +0100
8139cp: Enable offload features by default
I fixed TSO. Hardware checksum and scatter/gather also appear to be
working correctly both on real hardware and in QEMU's emulation.
Let's enable them by default and see if anyone screams...
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 686334f..deae10d 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -175,7 +175,7 @@ enum {
LastFrag = (1 << 28), /* Final segment of a packet */
LargeSend = (1 << 27), /* TCP Large Send Offload (TSO) */
MSSShift = 16, /* MSS value position */
- MSSMask = 0xfff, /* MSS value: 11 bits */
+ MSSMask = 0x7ff, /* MSS value: 11 bits */
TxError = (1 << 23), /* Tx error summary */
RxError = (1 << 20), /* Rx error summary */
IPCS = (1 << 18), /* Calculate IP checksum */
@@ -754,10 +754,16 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
mss = skb_shinfo(skb)->gso_size;
+ if (mss > MSSMask) {
+ WARN_ONCE(1, "Net bug: GSO size %d too large for 8139CP\n",
+ mss);
+ goto out_dma_error;
+ }
+
opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
opts1 = DescOwn;
if (mss)
- opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+ opts1 |= LargeSend | (mss << MSSShift);
else if (skb->ip_summed == CHECKSUM_PARTIAL) {
const struct iphdr *ip = ip_hdr(skb);
if (ip->protocol == IPPROTO_TCP)
@@ -1852,6 +1858,15 @@ static void cp_set_d3_state (struct cp_private *cp)
pci_set_power_state (cp->pdev, PCI_D3hot);
}
+static netdev_features_t cp_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ if (skb_shinfo(skb)->gso_size > MSSMask)
+ features &= ~NETIF_F_TSO;
+
+ return vlan_features_check(skb, features);
+}
static const struct net_device_ops cp_netdev_ops = {
.ndo_open = cp_open,
.ndo_stop = cp_close,
@@ -1864,6 +1879,7 @@ static const struct net_device_ops cp_netdev_ops = {
.ndo_tx_timeout = cp_tx_timeout,
.ndo_set_features = cp_set_features,
.ndo_change_mtu = cp_change_mtu,
+ .ndo_features_check = cp_features_check,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = cp_poll_controller,
@@ -1983,12 +1999,12 @@ static int cp_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
dev->ethtool_ops = &cp_ethtool_ops;
dev->watchdog_timeo = TX_TIMEOUT;
- dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
+ dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
if (pci_using_dac)
dev->features |= NETIF_F_HIGHDMA;
- /* disabled by default until verified */
dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |

View file

@ -0,0 +1,105 @@
commit 8b7a7048220f86547db31de0abe1ea6dd2cfa892
Author: David Woodhouse <dwmw2@infradead.org>
Date: Thu Sep 24 11:38:22 2015 +0100
8139cp: Fix GSO MSS handling
When fixing the TSO support I noticed we just mask ->gso_size with the
MSSMask value and don't care about the consequences.
Provide a .ndo_features_check() method which drops the NETIF_F_TSO
feature for any skb which would exceed the maximum, and thus forces it
to be segmented by software.
Then we can stop the masking in cp_start_xmit(), and just WARN if the
maximum is exceeded, which should now never happen.
Finally, Francois Romieu noticed that we didn't even have the right
value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 5a58f227790faded5a3ef6075f3ddd65093e0f86
Author: David Woodhouse <David.Woodhouse@intel.com>
Date: Wed Sep 23 09:46:09 2015 +0100
8139cp: Enable offload features by default
I fixed TSO. Hardware checksum and scatter/gather also appear to be
working correctly both on real hardware and in QEMU's emulation.
Let's enable them by default and see if anyone screams...
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 686334f..deae10d 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -175,7 +175,7 @@ enum {
LastFrag = (1 << 28), /* Final segment of a packet */
LargeSend = (1 << 27), /* TCP Large Send Offload (TSO) */
MSSShift = 16, /* MSS value position */
- MSSMask = 0xfff, /* MSS value: 11 bits */
+ MSSMask = 0x7ff, /* MSS value: 11 bits */
TxError = (1 << 23), /* Tx error summary */
RxError = (1 << 20), /* Rx error summary */
IPCS = (1 << 18), /* Calculate IP checksum */
@@ -754,10 +754,16 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
mss = skb_shinfo(skb)->gso_size;
+ if (mss > MSSMask) {
+ WARN_ONCE(1, "Net bug: GSO size %d too large for 8139CP\n",
+ mss);
+ goto out_dma_error;
+ }
+
opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
opts1 = DescOwn;
if (mss)
- opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+ opts1 |= LargeSend | (mss << MSSShift);
else if (skb->ip_summed == CHECKSUM_PARTIAL) {
const struct iphdr *ip = ip_hdr(skb);
if (ip->protocol == IPPROTO_TCP)
@@ -1852,6 +1858,15 @@ static void cp_set_d3_state (struct cp_private *cp)
pci_set_power_state (cp->pdev, PCI_D3hot);
}
+static netdev_features_t cp_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ if (skb_shinfo(skb)->gso_size > MSSMask)
+ features &= ~NETIF_F_TSO;
+
+ return vlan_features_check(skb, features);
+}
static const struct net_device_ops cp_netdev_ops = {
.ndo_open = cp_open,
.ndo_stop = cp_close,
@@ -1864,6 +1879,7 @@ static const struct net_device_ops cp_netdev_ops = {
.ndo_tx_timeout = cp_tx_timeout,
.ndo_set_features = cp_set_features,
.ndo_change_mtu = cp_change_mtu,
+ .ndo_features_check = cp_features_check,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = cp_poll_controller,
@@ -1983,12 +1999,12 @@ static int cp_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
dev->ethtool_ops = &cp_ethtool_ops;
dev->watchdog_timeo = TX_TIMEOUT;
- dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
+ dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
+ NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
if (pci_using_dac)
dev->features |= NETIF_F_HIGHDMA;
- /* disabled by default until verified */
dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |